{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 简介\n",
    "\n",
    "在上一篇文章中,我们对数据进行了初步的EDA操作,本文我们结合上一篇文章的EDA介绍一种线上线下一致的方案,该方案是我们目前多种建模方案中一个非常重要的组成部分,目前该方案在初期可以拿到线上线下一致的结果,可以帮助大家进行特征的构建。\n",
    "\n",
    "本文我们仅仅给出该方案的大致框架,大家在该框架的基础上进行特征工程的构建,就可以帮助大家拿到线上0.697左右的分数,为了不影响很多赛友,我们不会开源特别高的分数, 此处仅给出多分类方案的一套完整的框架, 大家可以在此基础上进行修改, 祝大家比赛愉快。\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 工具包导入&数据读取\n",
    "## 工具包导入"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import lightgbm as lgb\n",
    "import os\n",
    "\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "from tqdm import tqdm\n",
    "import json \n",
    "from sklearn.metrics import f1_score\n",
    "from sklearn.model_selection import StratifiedKFold\n",
    "from itertools import product\n",
    "import ast\n",
    "# pd.options.display.precision = 15"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2.2.3'"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lgb.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 数据读取"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "path = '/data/Data_JieZhang/KDD/'\n",
    "\n",
    "train_queries = pd.read_csv(path + 'train_queries.csv', parse_dates=['req_time'])\n",
    "train_plans   = pd.read_csv(path + 'train_plans.csv', parse_dates=['plan_time'])\n",
    "train_clicks  = pd.read_csv(path + 'train_clicks.csv')\n",
    "profiles      = pd.read_csv(path + 'profiles.csv') \n",
    "\n",
    "test_queries  = pd.read_csv(path + 'test_queries.csv', parse_dates=['req_time'])\n",
    "test_plans    = pd.read_csv(path + 'test_plans.csv', parse_dates=['plan_time'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 特征工程\n",
    "\n",
    "- 此处我们对所有表格进行合并,这样方便提取表格之间的交互特征,注意因为初赛的数据相对较少,所以我们才合在一起,不然尽量不要做,这样会给机器的内存带来非常大的负担.\n",
    "\n",
    "## 数据集合并\n",
    " "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "train = train_queries.merge(train_plans, 'left', ['sid'])\n",
    "test  = test_queries.merge(test_plans, 'left', ['sid'])\n",
    "\n",
    "train = train.merge(train_clicks, 'left', ['sid'])\n",
    "train['click_mode'] = train['click_mode'].fillna(0).astype(int)\n",
    "data  = pd.concat([train, test], ignore_index=True)\n",
    "\n",
    "data  = data.merge(profiles, 'left', ['pid']) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## od(经纬度)特征\n",
    "- 因为经纬度是组合字符串特征,此处我们对其进行还原,因为o,d本身是有相对大小关系的,我们不再对其进行编码。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "data['o_lng'] = data['o'].apply(lambda x: float(x.split(',')[0]))\n",
    "data['o_lat'] = data['o'].apply(lambda x: float(x.split(',')[1]))\n",
    "data['d_lng'] = data['d'].apply(lambda x: float(x.split(',')[0]))\n",
    "data['d_lat'] = data['d'].apply(lambda x: float(x.split(',')[1])) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## plan_time & req_time特征\n",
    "### 原始特征\n",
    "时间信息会影响我们的决定,比如大晚上从A地到B地其实很多人是不会选择步行的,更多的会选择打车之类的,因为太黑了,怕迷路等;而如果是早高峰,而且离公司就几公里的情况, 那么一般就不会打车，因为特别会容易堵车,这个时候大家更喜欢骑自行车.\n",
    "\n",
    "- 此处我们提取weekday来标志是周几; hour来标志是当日几点.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "time_feature = []\n",
    "for i in ['req_time']:\n",
    "    data[i + '_hour'] = data[i].dt.hour\n",
    "    data[i + '_weekday'] = data[i].dt.weekday\n",
    "    time_feature.append(i + '_hour')\n",
    "    time_feature.append(i + '_weekday') "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### plan_time & req_time差值特征\n",
    "\n",
    "我们做EDA的时候发现plan_time和req_time并不是完全一样的,有的有一些时间差,我们猜测可能是手机的网速等问题,所以我们做差值来标志用户的手机信号等信息.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "data['time_diff'] = data['plan_time'].astype(int) - data['req_time'].astype(int)\n",
    "time_feature.append('time_diff')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## plans特征\n",
    "\n",
    "plans这个数据集包含的信息非常多,因为这个信息是基于百度地图推荐的。所以毫无疑问是本次比赛的关键之一,我们对其进行展开并提取相关的特征。\n",
    "\n",
    "此处我发现kdd已经有大佬开源了plans的特征提取关键代码,我感觉很不错,此处我便直接引用,至于其他的特征欢迎去作者的Github下载.\n",
    "\n",
    "此处关于plans的特征主要可以归纳为如下的特征:\n",
    "\n",
    "1. 百度地图推荐的距离的统计值(mean,min,max,std)\n",
    "2. 各种交通方式的价格的统计值(mean,min,max,std)\n",
    "3. 各种交通方式的时间的统计值(mean,min,max,std)\n",
    "4. 一些其他的特征,最大距离的交通方式,最高价格的交通方式,最短时间的交通方式等."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "data['plans_json'] = data['plans'].fillna('[]').apply(lambda x: json.loads(x))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "rank_feature      =  []\n",
    "for i in range(data['plans_json'].apply(len).max()):\n",
    "    rank_feature.append('price_' + str(i))\n",
    "    rank_feature.append('eta_' + str(i))\n",
    "    rank_feature.append('distance_' + str(i)) \n",
    "    data['rank_' + str(i)]     = data['plans_json'].apply(lambda x: x[i]['transport_mode'] if len(x) > i else None)\n",
    "    data['price_' + str(i)]    = data['plans_json'].apply(lambda x: x[i]['price'] if len(x) > i else None)\n",
    "    data['price_' + str(i)]    = data['price_' + str(i)].apply(lambda x: 0 if x == ''or x is None else float(x))\n",
    "    data['eta_' + str(i)]      = data['plans_json'].apply(lambda x: x[i]['eta'] if len(x) > i else None)\n",
    "    data['distance_' + str(i)] = data['plans_json'].apply(lambda x: x[i]['distance'] if len(x) > i else None)\n",
    "    if i == 0:\n",
    "        continue\n",
    "    rank_feature.append('rank_' + str(i))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "def gen_plan_feas(data):\n",
    "    n                                           = data.shape[0]\n",
    "    mode_list_feas                              = np.zeros((n, 12))\n",
    "    max_dist, min_dist, mean_dist, std_dist     = np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,))\n",
    "\n",
    "    max_price, min_price, mean_price, std_price = np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,))\n",
    "\n",
    "    max_eta, min_eta, mean_eta, std_eta         = np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,))\n",
    "\n",
    "    min_dist_mode, max_dist_mode, min_price_mode, max_price_mode, min_eta_mode, max_eta_mode, first_mode = \\\n",
    "    np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,)), np.zeros((n,))\n",
    "  \n",
    "    mode_texts = []\n",
    "    for i, plan in tqdm(enumerate(data['plans_json'].values)):\n",
    "        if len(plan) == 0:\n",
    "            cur_plan_list   = []\n",
    "        else:\n",
    "            cur_plan_list   = plan\n",
    "        if len(cur_plan_list) == 0:\n",
    "            mode_list_feas[i, 0] =  1\n",
    "            first_mode[i]        =  0\n",
    "\n",
    "            max_dist[i]          = -1\n",
    "            min_dist[i]          = -1\n",
    "            mean_dist[i]         = -1\n",
    "            std_dist[i]          = -1\n",
    "\n",
    "            max_price[i]         = -1\n",
    "            min_price[i]         = -1\n",
    "            mean_price[i]        = -1\n",
    "            std_price[i]         = -1\n",
    "\n",
    "            max_eta[i]           = -1\n",
    "            min_eta[i]           = -1\n",
    "            mean_eta[i]          = -1\n",
    "            std_eta[i]           = -1\n",
    "\n",
    "            min_dist_mode[i]     = -1\n",
    "            max_dist_mode[i]     = -1\n",
    "            min_price_mode[i]    = -1\n",
    "            max_price_mode[i]    = -1\n",
    "            min_eta_mode[i]      = -1\n",
    "            max_eta_mode[i]      = -1\n",
    "\n",
    "            mode_texts.append('word_null')\n",
    "        else:\n",
    "            distance_list = []\n",
    "            price_list = []\n",
    "            eta_list = []\n",
    "            mode_list = []\n",
    "            for tmp_dit in cur_plan_list:\n",
    "                distance_list.append(int(tmp_dit['distance']))\n",
    "                if tmp_dit['price'] == '':\n",
    "                    price_list.append(0)\n",
    "                else:\n",
    "                    price_list.append(int(tmp_dit['price']))\n",
    "                eta_list.append(int(tmp_dit['eta']))\n",
    "                mode_list.append(int(tmp_dit['transport_mode']))\n",
    "            mode_texts.append(\n",
    "                ' '.join(['word_{}'.format(mode) for mode in mode_list]))\n",
    "            distance_list                = np.array(distance_list)\n",
    "            price_list                   = np.array(price_list)\n",
    "            eta_list                     = np.array(eta_list)\n",
    "            mode_list                    = np.array(mode_list, dtype='int')\n",
    "            mode_list_feas[i, mode_list] = 1\n",
    "            distance_sort_idx            = np.argsort(distance_list)\n",
    "            price_sort_idx               = np.argsort(price_list)\n",
    "            eta_sort_idx                 = np.argsort(eta_list)\n",
    "\n",
    "            max_dist[i]                  = distance_list[distance_sort_idx[-1]]\n",
    "            min_dist[i]                  = distance_list[distance_sort_idx[0]]\n",
    "            mean_dist[i]                 = np.mean(distance_list)\n",
    "            std_dist[i]                  = np.std(distance_list)\n",
    "\n",
    "            max_price[i]                 = price_list[price_sort_idx[-1]]\n",
    "            min_price[i]                 = price_list[price_sort_idx[0]]\n",
    "            mean_price[i]                = np.mean(price_list)\n",
    "            std_price[i]                 = np.std(price_list)\n",
    "\n",
    "            max_eta[i]                   = eta_list[eta_sort_idx[-1]]\n",
    "            min_eta[i]                   = eta_list[eta_sort_idx[0]]\n",
    "            mean_eta[i]                  = np.mean(eta_list)\n",
    "            std_eta[i]                   = np.std(eta_list)\n",
    "\n",
    "            first_mode[i]                = mode_list[0]\n",
    "            max_dist_mode[i]             = mode_list[distance_sort_idx[-1]]\n",
    "            min_dist_mode[i]             = mode_list[distance_sort_idx[0]]\n",
    "\n",
    "            max_price_mode[i]            = mode_list[price_sort_idx[-1]]\n",
    "            min_price_mode[i]            = mode_list[price_sort_idx[0]]\n",
    "\n",
    "            max_eta_mode[i]              = mode_list[eta_sort_idx[-1]]\n",
    "            min_eta_mode[i]              = mode_list[eta_sort_idx[0]]\n",
    "\n",
    "    feature_data                   =  pd.DataFrame(mode_list_feas)\n",
    "    feature_data.columns           =  ['mode_feas_{}'.format(i) for i in range(12)]\n",
    "    feature_data['max_dist']       =  max_dist\n",
    "    feature_data['min_dist']       =  min_dist\n",
    "    feature_data['mean_dist']      =  mean_dist\n",
    "    feature_data['std_dist']       =  std_dist\n",
    "\n",
    "    feature_data['max_price']      = max_price\n",
    "    feature_data['min_price']      = min_price\n",
    "    feature_data['mean_price']     = mean_price\n",
    "    feature_data['std_price']      = std_price\n",
    "\n",
    "    feature_data['max_eta']        = max_eta\n",
    "    feature_data['min_eta']        = min_eta\n",
    "    feature_data['mean_eta']       = mean_eta\n",
    "    feature_data['std_eta']        = std_eta\n",
    "\n",
    "    feature_data['max_dist_mode']  = max_dist_mode\n",
    "    feature_data['min_dist_mode']  = min_dist_mode\n",
    "    feature_data['max_price_mode'] = max_price_mode\n",
    "    feature_data['min_price_mode'] = min_price_mode\n",
    "    feature_data['max_eta_mode']   = max_eta_mode\n",
    "    feature_data['min_eta_mode']   = min_eta_mode\n",
    "    feature_data['first_mode']     = first_mode\n",
    "    print('mode tfidf...')\n",
    "    tfidf_enc = TfidfVectorizer(ngram_range=(1, 2))\n",
    "    tfidf_vec = tfidf_enc.fit_transform(mode_texts)\n",
    "    svd_enc = TruncatedSVD(n_components=10, n_iter=20, random_state=2019)\n",
    "    mode_svd = svd_enc.fit_transform(tfidf_vec)\n",
    "    mode_svd = pd.DataFrame(mode_svd)\n",
    "    mode_svd.columns = ['svd_mode_{}'.format(i) for i in range(10)]\n",
    "\n",
    "    plan_fea = pd.concat([feature_data, mode_svd], axis=1)\n",
    "    plan_fea['sid'] = data['sid'].values\n",
    "    return plan_fea"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import TfidfVectorizer\n",
    "from sklearn.decomposition import TruncatedSVD"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "594358it [01:24, 7032.82it/s]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mode tfidf...\n"
     ]
    }
   ],
   "source": [
    "data_plans = gen_plan_feas(data)\n",
    "plan_features = [col for col in data_plans.columns if col not in ['sid']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = data.merge(data_plans, on='sid', how='left')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 模型训练&验证&提交\n",
    "## 模型训练&验证\n",
    "\n",
    "### 评估指标设计\n",
    "- 为了对线上线下有一定的了解,我们尽可能设计和线上一样的评估,下面是lgb的评估函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f1_weighted(labels,preds):\n",
    "    preds = np.argmax(preds.reshape(12, -1), axis=0)\n",
    "    score = f1_score(y_true=labels, y_pred=preds, average='weighted')\n",
    "    return 'f1_weighted', score, True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 模型验证\n",
    "- 此处我们模拟线上,选用7天的时间作为验证集."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "data['pid'] = data['pid'].fillna(-1).astype(int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "121 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff', 'rank_1', 'rank_2', 'rank_3', 'rank_4', 'rank_5', 'rank_6']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.628292\n",
      "[20]\tvalid_0's f1_weighted: 0.676445\n",
      "[30]\tvalid_0's f1_weighted: 0.682759\n",
      "[40]\tvalid_0's f1_weighted: 0.684067\n",
      "[50]\tvalid_0's f1_weighted: 0.685105\n",
      "[60]\tvalid_0's f1_weighted: 0.685842\n",
      "[70]\tvalid_0's f1_weighted: 0.686488\n",
      "[80]\tvalid_0's f1_weighted: 0.686845\n",
      "[90]\tvalid_0's f1_weighted: 0.687268\n",
      "[100]\tvalid_0's f1_weighted: 0.687988\n",
      "[110]\tvalid_0's f1_weighted: 0.688154\n",
      "[120]\tvalid_0's f1_weighted: 0.688158\n",
      "[130]\tvalid_0's f1_weighted: 0.688491\n",
      "[140]\tvalid_0's f1_weighted: 0.688697\n",
      "[150]\tvalid_0's f1_weighted: 0.688873\n",
      "[160]\tvalid_0's f1_weighted: 0.688811\n",
      "[170]\tvalid_0's f1_weighted: 0.6888\n",
      "[180]\tvalid_0's f1_weighted: 0.688903\n",
      "[190]\tvalid_0's f1_weighted: 0.688801\n",
      "[200]\tvalid_0's f1_weighted: 0.688857\n",
      "[210]\tvalid_0's f1_weighted: 0.688853\n",
      "[220]\tvalid_0's f1_weighted: 0.689039\n",
      "[230]\tvalid_0's f1_weighted: 0.688969\n",
      "[240]\tvalid_0's f1_weighted: 0.689045\n",
      "[250]\tvalid_0's f1_weighted: 0.688953\n",
      "[260]\tvalid_0's f1_weighted: 0.68903\n",
      "[270]\tvalid_0's f1_weighted: 0.689025\n",
      "[280]\tvalid_0's f1_weighted: 0.689031\n",
      "[290]\tvalid_0's f1_weighted: 0.68888\n",
      "[300]\tvalid_0's f1_weighted: 0.689027\n",
      "[310]\tvalid_0's f1_weighted: 0.688924\n",
      "[320]\tvalid_0's f1_weighted: 0.688813\n",
      "[330]\tvalid_0's f1_weighted: 0.689013\n",
      "[340]\tvalid_0's f1_weighted: 0.689144\n",
      "[350]\tvalid_0's f1_weighted: 0.689095\n",
      "[360]\tvalid_0's f1_weighted: 0.689068\n",
      "[370]\tvalid_0's f1_weighted: 0.688896\n",
      "[380]\tvalid_0's f1_weighted: 0.688992\n",
      "[390]\tvalid_0's f1_weighted: 0.688995\n",
      "[400]\tvalid_0's f1_weighted: 0.688924\n",
      "Early stopping, best iteration is:\n",
      "[306]\tvalid_0's f1_weighted: 0.689214\n",
      "CPU times: user 3h 36min 19s, sys: 18.5 s, total: 3h 36min 38s\n",
      "Wall time: 4min 34s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature + rank_feature\n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "test_index = (data.req_time > '2018-12-01')\n",
    "test_x     = data[test_index][feature].reset_index(drop=True)\n",
    "\n",
    "print(len(feature), feature)\n",
    "\n",
    "lgb_model = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass',\n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,verbose=10, early_stopping_rounds=100)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "S = 0.014053913260960742 #0.01625252\n",
    "A = 63388 #94358 #92571"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5541.0"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.sqrt(((A*S) ** 2 / 8 + A**2 * S) / 2 ) + (A*S/4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.016252522441261897"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "Num0      = 8898\n",
    "A         = 94358\n",
    "Precision = Num0 / A\n",
    "Recall    =  1\n",
    " \n",
    "w0        = Num0 / A\n",
    "\n",
    "w0 * (2 * Precision * Recall) / ( Precision + Recall) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = pd.read_csv('/data/Data_JieZhang/KDD/test_queries.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(94358, 5)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "data['click_time'] = pd.to_datetime(data['click_time'])\n",
    "data['time_delay'] = (data['click_time'] - data['req_time']).dt.seconds "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [],
   "source": [
    "data['time_delay'] = data['time_delay'].fillna(data['time_delay'].max() + 100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {},
   "outputs": [],
   "source": [
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature \n",
    " \n",
    "train_index = (data.req_time < '2018-11-23') # &  (data.click_time.isnull() == False)\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].time_delay.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01') # & (data.plan_time.isnull() == False)\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].time_delay.reset_index(drop=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10733"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.plan_time.isnull().sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "FOLD: \n",
      "87323 349289\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's l1: 7590.04\n",
      "[20]\tvalid_0's l1: 7225.84\n",
      "[30]\tvalid_0's l1: 7007.8\n",
      "[40]\tvalid_0's l1: 6877.26\n",
      "[50]\tvalid_0's l1: 6799.1\n",
      "[60]\tvalid_0's l1: 6752.3\n",
      "[70]\tvalid_0's l1: 6724.28\n",
      "[80]\tvalid_0's l1: 6707.5\n",
      "[90]\tvalid_0's l1: 6697.45\n",
      "[100]\tvalid_0's l1: 6691.43\n",
      "[110]\tvalid_0's l1: 6687.83\n",
      "[120]\tvalid_0's l1: 6685.67\n",
      "[130]\tvalid_0's l1: 6684.38\n",
      "[140]\tvalid_0's l1: 6683.6\n",
      "[150]\tvalid_0's l1: 6683.14\n",
      "[160]\tvalid_0's l1: 6682.86\n",
      "[170]\tvalid_0's l1: 6682.69\n",
      "[180]\tvalid_0's l1: 6682.59\n",
      "[190]\tvalid_0's l1: 6682.53\n",
      "[200]\tvalid_0's l1: 6682.5\n",
      "[210]\tvalid_0's l1: 6682.48\n",
      "[220]\tvalid_0's l1: 6682.46\n",
      "[230]\tvalid_0's l1: 6682.45\n",
      "[240]\tvalid_0's l1: 6682.45\n",
      "[250]\tvalid_0's l1: 6682.44\n",
      "[260]\tvalid_0's l1: 6682.44\n",
      "[270]\tvalid_0's l1: 6682.44\n",
      "[280]\tvalid_0's l1: 6682.44\n",
      "[290]\tvalid_0's l1: 6682.44\n",
      "[300]\tvalid_0's l1: 6682.44\n",
      "[310]\tvalid_0's l1: 6682.44\n",
      "[320]\tvalid_0's l1: 6682.44\n",
      "[330]\tvalid_0's l1: 6682.44\n",
      "[340]\tvalid_0's l1: 6682.43\n",
      "[350]\tvalid_0's l1: 6682.43\n",
      "[360]\tvalid_0's l1: 6682.43\n",
      "[370]\tvalid_0's l1: 6682.43\n",
      "[380]\tvalid_0's l1: 6682.43\n",
      "[390]\tvalid_0's l1: 6682.43\n",
      "[400]\tvalid_0's l1: 6682.43\n",
      "[410]\tvalid_0's l1: 6682.43\n",
      "[420]\tvalid_0's l1: 6682.43\n",
      "[430]\tvalid_0's l1: 6682.43\n",
      "[440]\tvalid_0's l1: 6682.43\n",
      "[450]\tvalid_0's l1: 6682.43\n",
      "[460]\tvalid_0's l1: 6682.43\n",
      "[470]\tvalid_0's l1: 6682.43\n",
      "[480]\tvalid_0's l1: 6682.43\n",
      "[490]\tvalid_0's l1: 6682.43\n",
      "[500]\tvalid_0's l1: 6682.43\n",
      "[510]\tvalid_0's l1: 6682.43\n",
      "[520]\tvalid_0's l1: 6682.43\n",
      "[530]\tvalid_0's l1: 6682.43\n",
      "[540]\tvalid_0's l1: 6682.43\n",
      "[550]\tvalid_0's l1: 6682.43\n",
      "[560]\tvalid_0's l1: 6682.43\n",
      "[570]\tvalid_0's l1: 6682.43\n",
      "[580]\tvalid_0's l1: 6682.43\n",
      "[590]\tvalid_0's l1: 6682.43\n",
      "[600]\tvalid_0's l1: 6682.43\n",
      "[610]\tvalid_0's l1: 6682.43\n",
      "[620]\tvalid_0's l1: 6682.43\n",
      "[630]\tvalid_0's l1: 6682.43\n",
      "[640]\tvalid_0's l1: 6682.43\n",
      "[650]\tvalid_0's l1: 6682.43\n",
      "Early stopping, best iteration is:\n",
      "[552]\tvalid_0's l1: 6682.43\n",
      "FOLD: \n",
      "87323 349289\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's l1: 7781.66\n",
      "[20]\tvalid_0's l1: 7413.9\n",
      "[30]\tvalid_0's l1: 7193.73\n",
      "[40]\tvalid_0's l1: 7061.91\n",
      "[50]\tvalid_0's l1: 6982.99\n",
      "[60]\tvalid_0's l1: 6935.73\n",
      "[70]\tvalid_0's l1: 6907.44\n",
      "[80]\tvalid_0's l1: 6890.49\n",
      "[90]\tvalid_0's l1: 6880.35\n",
      "[100]\tvalid_0's l1: 6874.27\n",
      "[110]\tvalid_0's l1: 6870.63\n",
      "[120]\tvalid_0's l1: 6868.45\n",
      "[130]\tvalid_0's l1: 6867.15\n",
      "[140]\tvalid_0's l1: 6866.36\n",
      "[150]\tvalid_0's l1: 6865.9\n",
      "[160]\tvalid_0's l1: 6865.61\n",
      "[170]\tvalid_0's l1: 6865.45\n",
      "[180]\tvalid_0's l1: 6865.34\n",
      "[190]\tvalid_0's l1: 6865.28\n",
      "[200]\tvalid_0's l1: 6865.25\n",
      "[210]\tvalid_0's l1: 6865.22\n",
      "[220]\tvalid_0's l1: 6865.21\n",
      "[230]\tvalid_0's l1: 6865.2\n",
      "[240]\tvalid_0's l1: 6865.2\n",
      "[250]\tvalid_0's l1: 6865.19\n",
      "[260]\tvalid_0's l1: 6865.19\n",
      "[270]\tvalid_0's l1: 6865.19\n",
      "[280]\tvalid_0's l1: 6865.19\n",
      "[290]\tvalid_0's l1: 6865.19\n",
      "[300]\tvalid_0's l1: 6865.19\n",
      "[310]\tvalid_0's l1: 6865.19\n",
      "[320]\tvalid_0's l1: 6865.19\n",
      "[330]\tvalid_0's l1: 6865.19\n",
      "[340]\tvalid_0's l1: 6865.19\n",
      "[350]\tvalid_0's l1: 6865.19\n",
      "[360]\tvalid_0's l1: 6865.19\n",
      "[370]\tvalid_0's l1: 6865.19\n",
      "[380]\tvalid_0's l1: 6865.19\n",
      "[390]\tvalid_0's l1: 6865.19\n",
      "[400]\tvalid_0's l1: 6865.18\n",
      "[410]\tvalid_0's l1: 6865.18\n",
      "[420]\tvalid_0's l1: 6865.18\n",
      "[430]\tvalid_0's l1: 6865.18\n",
      "[440]\tvalid_0's l1: 6865.18\n",
      "[450]\tvalid_0's l1: 6865.18\n",
      "[460]\tvalid_0's l1: 6865.18\n",
      "[470]\tvalid_0's l1: 6865.18\n",
      "[480]\tvalid_0's l1: 6865.18\n",
      "[490]\tvalid_0's l1: 6865.18\n",
      "[500]\tvalid_0's l1: 6865.18\n",
      "[510]\tvalid_0's l1: 6865.18\n",
      "[520]\tvalid_0's l1: 6865.18\n",
      "[530]\tvalid_0's l1: 6865.18\n",
      "[540]\tvalid_0's l1: 6865.18\n",
      "[550]\tvalid_0's l1: 6865.18\n",
      "[560]\tvalid_0's l1: 6865.18\n",
      "[570]\tvalid_0's l1: 6865.18\n",
      "[580]\tvalid_0's l1: 6865.18\n",
      "[590]\tvalid_0's l1: 6865.18\n",
      "[600]\tvalid_0's l1: 6865.18\n",
      "[610]\tvalid_0's l1: 6865.18\n",
      "[620]\tvalid_0's l1: 6865.18\n",
      "[630]\tvalid_0's l1: 6865.18\n",
      "[640]\tvalid_0's l1: 6865.18\n",
      "[650]\tvalid_0's l1: 6865.18\n",
      "Early stopping, best iteration is:\n",
      "[552]\tvalid_0's l1: 6865.18\n",
      "FOLD: \n",
      "87322 349290\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's l1: 7644.22\n",
      "[20]\tvalid_0's l1: 7268.37\n",
      "[30]\tvalid_0's l1: 7043.36\n",
      "[40]\tvalid_0's l1: 6908.64\n",
      "[50]\tvalid_0's l1: 6827.98\n",
      "[60]\tvalid_0's l1: 6779.68\n",
      "[70]\tvalid_0's l1: 6750.77\n",
      "[80]\tvalid_0's l1: 6733.45\n",
      "[90]\tvalid_0's l1: 6723.08\n",
      "[100]\tvalid_0's l1: 6716.87\n",
      "[110]\tvalid_0's l1: 6713.15\n",
      "[120]\tvalid_0's l1: 6710.93\n",
      "[130]\tvalid_0's l1: 6709.59\n",
      "[140]\tvalid_0's l1: 6708.79\n",
      "[150]\tvalid_0's l1: 6708.32\n",
      "[160]\tvalid_0's l1: 6708.03\n",
      "[170]\tvalid_0's l1: 6707.86\n",
      "[180]\tvalid_0's l1: 6707.75\n",
      "[190]\tvalid_0's l1: 6707.69\n",
      "[200]\tvalid_0's l1: 6707.65\n",
      "[210]\tvalid_0's l1: 6707.63\n",
      "[220]\tvalid_0's l1: 6707.62\n",
      "[230]\tvalid_0's l1: 6707.61\n",
      "[240]\tvalid_0's l1: 6707.6\n",
      "[250]\tvalid_0's l1: 6707.6\n",
      "[260]\tvalid_0's l1: 6707.6\n",
      "[270]\tvalid_0's l1: 6707.6\n",
      "[280]\tvalid_0's l1: 6707.6\n",
      "[290]\tvalid_0's l1: 6707.59\n",
      "[300]\tvalid_0's l1: 6707.59\n",
      "[310]\tvalid_0's l1: 6707.59\n",
      "[320]\tvalid_0's l1: 6707.59\n",
      "[330]\tvalid_0's l1: 6707.59\n",
      "[340]\tvalid_0's l1: 6707.59\n",
      "[350]\tvalid_0's l1: 6707.59\n",
      "[360]\tvalid_0's l1: 6707.59\n",
      "[370]\tvalid_0's l1: 6707.59\n",
      "[380]\tvalid_0's l1: 6707.59\n",
      "[390]\tvalid_0's l1: 6707.59\n",
      "[400]\tvalid_0's l1: 6707.59\n",
      "[410]\tvalid_0's l1: 6707.59\n",
      "[420]\tvalid_0's l1: 6707.59\n",
      "[430]\tvalid_0's l1: 6707.59\n",
      "[440]\tvalid_0's l1: 6707.59\n",
      "[450]\tvalid_0's l1: 6707.59\n",
      "[460]\tvalid_0's l1: 6707.59\n",
      "[470]\tvalid_0's l1: 6707.59\n",
      "[480]\tvalid_0's l1: 6707.59\n",
      "[490]\tvalid_0's l1: 6707.59\n",
      "[500]\tvalid_0's l1: 6707.59\n",
      "[510]\tvalid_0's l1: 6707.59\n",
      "[520]\tvalid_0's l1: 6707.59\n",
      "[530]\tvalid_0's l1: 6707.59\n",
      "[540]\tvalid_0's l1: 6707.59\n",
      "[550]\tvalid_0's l1: 6707.59\n",
      "[560]\tvalid_0's l1: 6707.59\n",
      "[570]\tvalid_0's l1: 6707.59\n",
      "[580]\tvalid_0's l1: 6707.59\n",
      "[590]\tvalid_0's l1: 6707.59\n",
      "[600]\tvalid_0's l1: 6707.59\n",
      "[610]\tvalid_0's l1: 6707.59\n",
      "[620]\tvalid_0's l1: 6707.59\n",
      "[630]\tvalid_0's l1: 6707.59\n",
      "[640]\tvalid_0's l1: 6707.59\n",
      "[650]\tvalid_0's l1: 6707.59\n",
      "[660]\tvalid_0's l1: 6707.59\n",
      "[670]\tvalid_0's l1: 6707.59\n",
      "[680]\tvalid_0's l1: 6707.59\n",
      "[690]\tvalid_0's l1: 6707.59\n",
      "[700]\tvalid_0's l1: 6707.59\n",
      "Early stopping, best iteration is:\n",
      "[601]\tvalid_0's l1: 6707.59\n",
      "FOLD: \n",
      "87322 349290\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's l1: 7551.97\n",
      "[20]\tvalid_0's l1: 7179.93\n",
      "[30]\tvalid_0's l1: 6957.19\n",
      "[40]\tvalid_0's l1: 6823.84\n",
      "[50]\tvalid_0's l1: 6743.99\n",
      "[60]\tvalid_0's l1: 6696.19\n",
      "[70]\tvalid_0's l1: 6667.56\n",
      "[80]\tvalid_0's l1: 6650.42\n",
      "[90]\tvalid_0's l1: 6640.16\n",
      "[100]\tvalid_0's l1: 6634.01\n",
      "[110]\tvalid_0's l1: 6630.33\n",
      "[120]\tvalid_0's l1: 6628.13\n",
      "[130]\tvalid_0's l1: 6626.81\n",
      "[140]\tvalid_0's l1: 6626.02\n",
      "[150]\tvalid_0's l1: 6625.54\n",
      "[160]\tvalid_0's l1: 6625.26\n",
      "[170]\tvalid_0's l1: 6625.09\n",
      "[180]\tvalid_0's l1: 6624.98\n",
      "[190]\tvalid_0's l1: 6624.92\n",
      "[200]\tvalid_0's l1: 6624.89\n",
      "[210]\tvalid_0's l1: 6624.86\n",
      "[220]\tvalid_0's l1: 6624.85\n",
      "[230]\tvalid_0's l1: 6624.84\n",
      "[240]\tvalid_0's l1: 6624.84\n",
      "[250]\tvalid_0's l1: 6624.83\n",
      "[260]\tvalid_0's l1: 6624.83\n",
      "[270]\tvalid_0's l1: 6624.83\n",
      "[280]\tvalid_0's l1: 6624.83\n",
      "[290]\tvalid_0's l1: 6624.83\n",
      "[300]\tvalid_0's l1: 6624.83\n",
      "[310]\tvalid_0's l1: 6624.83\n",
      "[320]\tvalid_0's l1: 6624.83\n",
      "[330]\tvalid_0's l1: 6624.83\n",
      "[340]\tvalid_0's l1: 6624.83\n",
      "[350]\tvalid_0's l1: 6624.83\n",
      "[360]\tvalid_0's l1: 6624.83\n",
      "[370]\tvalid_0's l1: 6624.83\n",
      "[380]\tvalid_0's l1: 6624.83\n",
      "[390]\tvalid_0's l1: 6624.83\n",
      "[400]\tvalid_0's l1: 6624.83\n",
      "[410]\tvalid_0's l1: 6624.83\n",
      "[420]\tvalid_0's l1: 6624.83\n",
      "[430]\tvalid_0's l1: 6624.82\n",
      "[440]\tvalid_0's l1: 6624.82\n",
      "[450]\tvalid_0's l1: 6624.82\n",
      "[460]\tvalid_0's l1: 6624.82\n",
      "[470]\tvalid_0's l1: 6624.82\n",
      "[480]\tvalid_0's l1: 6624.82\n",
      "[490]\tvalid_0's l1: 6624.82\n",
      "[500]\tvalid_0's l1: 6624.82\n",
      "[510]\tvalid_0's l1: 6624.82\n",
      "[520]\tvalid_0's l1: 6624.82\n",
      "[530]\tvalid_0's l1: 6624.82\n",
      "[540]\tvalid_0's l1: 6624.82\n",
      "[550]\tvalid_0's l1: 6624.82\n",
      "[560]\tvalid_0's l1: 6624.82\n",
      "[570]\tvalid_0's l1: 6624.82\n",
      "[580]\tvalid_0's l1: 6624.82\n",
      "[590]\tvalid_0's l1: 6624.82\n",
      "[600]\tvalid_0's l1: 6624.82\n",
      "[610]\tvalid_0's l1: 6624.82\n",
      "[620]\tvalid_0's l1: 6624.82\n",
      "[630]\tvalid_0's l1: 6624.82\n",
      "[640]\tvalid_0's l1: 6624.82\n",
      "[650]\tvalid_0's l1: 6624.82\n",
      "[660]\tvalid_0's l1: 6624.82\n",
      "[670]\tvalid_0's l1: 6624.82\n",
      "[680]\tvalid_0's l1: 6624.82\n",
      "[690]\tvalid_0's l1: 6624.82\n",
      "Early stopping, best iteration is:\n",
      "[590]\tvalid_0's l1: 6624.82\n",
      "FOLD: \n",
      "87322 349290\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's l1: 7540.66\n",
      "[20]\tvalid_0's l1: 7167.91\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[30]\tvalid_0's l1: 6944.75\n",
      "[40]\tvalid_0's l1: 6811.14\n",
      "[50]\tvalid_0's l1: 6731.14\n",
      "[60]\tvalid_0's l1: 6683.25\n",
      "[70]\tvalid_0's l1: 6654.57\n",
      "[80]\tvalid_0's l1: 6637.39\n",
      "[90]\tvalid_0's l1: 6627.11\n",
      "[100]\tvalid_0's l1: 6620.95\n",
      "[110]\tvalid_0's l1: 6617.27\n",
      "[120]\tvalid_0's l1: 6615.06\n",
      "[130]\tvalid_0's l1: 6613.74\n",
      "[140]\tvalid_0's l1: 6612.94\n",
      "[150]\tvalid_0's l1: 6612.47\n",
      "[160]\tvalid_0's l1: 6612.18\n",
      "[170]\tvalid_0's l1: 6612.01\n",
      "[180]\tvalid_0's l1: 6611.91\n",
      "[190]\tvalid_0's l1: 6611.85\n",
      "[200]\tvalid_0's l1: 6611.81\n",
      "[210]\tvalid_0's l1: 6611.79\n",
      "[220]\tvalid_0's l1: 6611.78\n",
      "[230]\tvalid_0's l1: 6611.77\n",
      "[240]\tvalid_0's l1: 6611.76\n",
      "[250]\tvalid_0's l1: 6611.76\n",
      "[260]\tvalid_0's l1: 6611.76\n",
      "[270]\tvalid_0's l1: 6611.76\n",
      "[280]\tvalid_0's l1: 6611.76\n",
      "[290]\tvalid_0's l1: 6611.75\n",
      "[300]\tvalid_0's l1: 6611.75\n",
      "[310]\tvalid_0's l1: 6611.75\n",
      "[320]\tvalid_0's l1: 6611.75\n",
      "[330]\tvalid_0's l1: 6611.75\n",
      "[340]\tvalid_0's l1: 6611.75\n",
      "[350]\tvalid_0's l1: 6611.75\n",
      "[360]\tvalid_0's l1: 6611.75\n",
      "[370]\tvalid_0's l1: 6611.75\n",
      "[380]\tvalid_0's l1: 6611.75\n",
      "[390]\tvalid_0's l1: 6611.75\n",
      "[400]\tvalid_0's l1: 6611.75\n",
      "[410]\tvalid_0's l1: 6611.75\n",
      "[420]\tvalid_0's l1: 6611.75\n",
      "[430]\tvalid_0's l1: 6611.75\n",
      "[440]\tvalid_0's l1: 6611.75\n",
      "[450]\tvalid_0's l1: 6611.75\n",
      "[460]\tvalid_0's l1: 6611.75\n",
      "[470]\tvalid_0's l1: 6611.75\n",
      "[480]\tvalid_0's l1: 6611.75\n",
      "[490]\tvalid_0's l1: 6611.75\n",
      "[500]\tvalid_0's l1: 6611.75\n",
      "[510]\tvalid_0's l1: 6611.75\n",
      "[520]\tvalid_0's l1: 6611.75\n",
      "[530]\tvalid_0's l1: 6611.75\n",
      "[540]\tvalid_0's l1: 6611.75\n",
      "[550]\tvalid_0's l1: 6611.75\n",
      "[560]\tvalid_0's l1: 6611.75\n",
      "[570]\tvalid_0's l1: 6611.75\n",
      "[580]\tvalid_0's l1: 6611.75\n",
      "[590]\tvalid_0's l1: 6611.75\n",
      "[600]\tvalid_0's l1: 6611.75\n",
      "[610]\tvalid_0's l1: 6611.75\n",
      "[620]\tvalid_0's l1: 6611.75\n",
      "Early stopping, best iteration is:\n",
      "[522]\tvalid_0's l1: 6611.75\n"
     ]
    }
   ],
   "source": [
    "import os  \n",
    "i = 0\n",
    "meta_train = np.zeros(shape = ((len(tra)),1))\n",
    "from sklearn.model_selection import StratifiedKFold,KFold \n",
    "skf = KFold(n_splits=5, shuffle=True)\n",
    "pred_test = 0\n",
    "for tr_ind,te_ind in skf.split(train_y):\n",
    "    print('FOLD: '.format(i))\n",
    "    print(len(te_ind),len(tr_ind)) \n",
    "    X_train,X_train_label = train_x.iloc[tr_ind],train_y.values[tr_ind]\n",
    "    X_val,X_val_label     = train_x.iloc[te_ind],train_y.values[te_ind]\n",
    "    \n",
    "    model =  lgb.LGBMRegressor(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "        max_depth=-1, n_estimators=1000, objective='mae',\n",
    "        subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "        learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "    eval_set = [(X_val, X_val_label)]\n",
    "    model.fit(X_train, X_train_label, eval_set=eval_set, eval_metric='mae',verbose=10, early_stopping_rounds=100)\n",
    " \n",
    "    pred_val = model.predict(X_val).reshape(-1,1) \n",
    "    pred_test += model.predict(valid_x) / 5\n",
    "    \n",
    "    meta_train[te_ind] = pred_val  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "data.loc[train_index, 'pred_click_time'] = meta_train\n",
    "data.loc[valid_index, 'pred_click_time'] = pred_test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(436612, 116)\n",
      "116 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff', 'pred_click_time']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.626767\n",
      "[20]\tvalid_0's f1_weighted: 0.67514\n",
      "[30]\tvalid_0's f1_weighted: 0.681615\n",
      "[40]\tvalid_0's f1_weighted: 0.683103\n",
      "[50]\tvalid_0's f1_weighted: 0.684118\n",
      "[60]\tvalid_0's f1_weighted: 0.685327\n",
      "[70]\tvalid_0's f1_weighted: 0.686144\n",
      "[80]\tvalid_0's f1_weighted: 0.686766\n",
      "[90]\tvalid_0's f1_weighted: 0.687061\n",
      "[100]\tvalid_0's f1_weighted: 0.687518\n",
      "[110]\tvalid_0's f1_weighted: 0.687584\n",
      "[120]\tvalid_0's f1_weighted: 0.687807\n",
      "[130]\tvalid_0's f1_weighted: 0.687912\n",
      "[140]\tvalid_0's f1_weighted: 0.688032\n",
      "[150]\tvalid_0's f1_weighted: 0.688008\n",
      "[160]\tvalid_0's f1_weighted: 0.68793\n",
      "[170]\tvalid_0's f1_weighted: 0.687986\n",
      "[180]\tvalid_0's f1_weighted: 0.688312\n",
      "[190]\tvalid_0's f1_weighted: 0.688463\n",
      "[200]\tvalid_0's f1_weighted: 0.688623\n",
      "[210]\tvalid_0's f1_weighted: 0.688633\n",
      "[220]\tvalid_0's f1_weighted: 0.688625\n",
      "[230]\tvalid_0's f1_weighted: 0.688756\n",
      "[240]\tvalid_0's f1_weighted: 0.688997\n",
      "[250]\tvalid_0's f1_weighted: 0.689028\n",
      "[260]\tvalid_0's f1_weighted: 0.689048\n",
      "[270]\tvalid_0's f1_weighted: 0.689081\n",
      "[280]\tvalid_0's f1_weighted: 0.689285\n",
      "[290]\tvalid_0's f1_weighted: 0.689308\n",
      "[300]\tvalid_0's f1_weighted: 0.689417\n",
      "[310]\tvalid_0's f1_weighted: 0.689298\n",
      "[320]\tvalid_0's f1_weighted: 0.689351\n",
      "[330]\tvalid_0's f1_weighted: 0.689366\n",
      "[340]\tvalid_0's f1_weighted: 0.68928\n",
      "[350]\tvalid_0's f1_weighted: 0.689471\n",
      "[360]\tvalid_0's f1_weighted: 0.689308\n",
      "[370]\tvalid_0's f1_weighted: 0.689287\n",
      "[380]\tvalid_0's f1_weighted: 0.689223\n",
      "[390]\tvalid_0's f1_weighted: 0.68936\n",
      "[400]\tvalid_0's f1_weighted: 0.689385\n",
      "[410]\tvalid_0's f1_weighted: 0.689301\n",
      "[420]\tvalid_0's f1_weighted: 0.689369\n",
      "[430]\tvalid_0's f1_weighted: 0.689186\n",
      "[440]\tvalid_0's f1_weighted: 0.689234\n",
      "[450]\tvalid_0's f1_weighted: 0.68905\n",
      "Early stopping, best iteration is:\n",
      "[350]\tvalid_0's f1_weighted: 0.689471\n",
      "CPU times: user 3h 58min 58s, sys: 21.9 s, total: 3h 59min 20s\n",
      "Wall time: 4min 37s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature  + ['pred_click_time']\n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    " \n",
    "print(train_x.shape)\n",
    "print(len(feature), feature)\n",
    "\n",
    "lgb_model1 = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass',\n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model1.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,verbose=10, early_stopping_rounds=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(436612, 116)\n",
      "116 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff', 'pred_click_time']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.62554\n",
      "[20]\tvalid_0's f1_weighted: 0.681277\n",
      "[30]\tvalid_0's f1_weighted: 0.685308\n",
      "[40]\tvalid_0's f1_weighted: 0.68684\n",
      "[50]\tvalid_0's f1_weighted: 0.68814\n",
      "[60]\tvalid_0's f1_weighted: 0.688912\n",
      "[70]\tvalid_0's f1_weighted: 0.689646\n",
      "[80]\tvalid_0's f1_weighted: 0.689901\n",
      "[90]\tvalid_0's f1_weighted: 0.690619\n",
      "[100]\tvalid_0's f1_weighted: 0.690799\n",
      "[110]\tvalid_0's f1_weighted: 0.691275\n",
      "[120]\tvalid_0's f1_weighted: 0.691644\n",
      "[130]\tvalid_0's f1_weighted: 0.691879\n",
      "[140]\tvalid_0's f1_weighted: 0.691943\n",
      "[150]\tvalid_0's f1_weighted: 0.692097\n",
      "[160]\tvalid_0's f1_weighted: 0.692322\n",
      "[170]\tvalid_0's f1_weighted: 0.692292\n",
      "[180]\tvalid_0's f1_weighted: 0.692443\n",
      "[190]\tvalid_0's f1_weighted: 0.692386\n",
      "[200]\tvalid_0's f1_weighted: 0.69248\n",
      "[210]\tvalid_0's f1_weighted: 0.692407\n",
      "[220]\tvalid_0's f1_weighted: 0.692454\n",
      "[230]\tvalid_0's f1_weighted: 0.692478\n",
      "[240]\tvalid_0's f1_weighted: 0.692413\n",
      "[250]\tvalid_0's f1_weighted: 0.692634\n",
      "[260]\tvalid_0's f1_weighted: 0.692638\n",
      "[270]\tvalid_0's f1_weighted: 0.692563\n",
      "[280]\tvalid_0's f1_weighted: 0.692434\n",
      "[290]\tvalid_0's f1_weighted: 0.692462\n",
      "[300]\tvalid_0's f1_weighted: 0.692513\n",
      "[310]\tvalid_0's f1_weighted: 0.692542\n",
      "[320]\tvalid_0's f1_weighted: 0.692622\n",
      "[330]\tvalid_0's f1_weighted: 0.692747\n",
      "[340]\tvalid_0's f1_weighted: 0.692777\n",
      "[350]\tvalid_0's f1_weighted: 0.692796\n",
      "[360]\tvalid_0's f1_weighted: 0.692809\n",
      "[370]\tvalid_0's f1_weighted: 0.692918\n",
      "[380]\tvalid_0's f1_weighted: 0.692849\n",
      "[390]\tvalid_0's f1_weighted: 0.692839\n",
      "[400]\tvalid_0's f1_weighted: 0.69302\n",
      "[410]\tvalid_0's f1_weighted: 0.693153\n",
      "[420]\tvalid_0's f1_weighted: 0.693222\n",
      "[430]\tvalid_0's f1_weighted: 0.693188\n",
      "[440]\tvalid_0's f1_weighted: 0.692954\n",
      "[450]\tvalid_0's f1_weighted: 0.693179\n",
      "[460]\tvalid_0's f1_weighted: 0.693142\n",
      "[470]\tvalid_0's f1_weighted: 0.692994\n",
      "[480]\tvalid_0's f1_weighted: 0.692958\n",
      "[490]\tvalid_0's f1_weighted: 0.693167\n",
      "[500]\tvalid_0's f1_weighted: 0.693103\n",
      "[510]\tvalid_0's f1_weighted: 0.693048\n",
      "[520]\tvalid_0's f1_weighted: 0.692828\n",
      "Early stopping, best iteration is:\n",
      "[426]\tvalid_0's f1_weighted: 0.693341\n",
      "CPU times: user 4h 36min 20s, sys: 21 s, total: 4h 36min 41s\n",
      "Wall time: 5min 48s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature  + ['pred_click_time']\n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    " \n",
    "print(train_x.shape)\n",
    "print(len(feature), feature)\n",
    "\n",
    "lgb_model3 = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass',class_weight=dict(zip(range(12),[0.85,0.9,0.8,1,1.6,0.4,1.15,0.9,1.2,1.1,1,1.5])), \n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model3.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,verbose=10, early_stopping_rounds=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.00387000000000004"
      ]
     },
     "execution_count": 101,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.693341 -  0.689471"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(436612, 115)\n",
      "115 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.627145\n",
      "[20]\tvalid_0's f1_weighted: 0.681712\n",
      "[30]\tvalid_0's f1_weighted: 0.68583\n",
      "[40]\tvalid_0's f1_weighted: 0.68749\n",
      "[50]\tvalid_0's f1_weighted: 0.688451\n",
      "[60]\tvalid_0's f1_weighted: 0.689001\n",
      "[70]\tvalid_0's f1_weighted: 0.689444\n",
      "[80]\tvalid_0's f1_weighted: 0.689886\n",
      "[90]\tvalid_0's f1_weighted: 0.69013\n",
      "[100]\tvalid_0's f1_weighted: 0.69033\n",
      "[110]\tvalid_0's f1_weighted: 0.690581\n",
      "[120]\tvalid_0's f1_weighted: 0.690735\n",
      "[130]\tvalid_0's f1_weighted: 0.690859\n",
      "[140]\tvalid_0's f1_weighted: 0.691115\n",
      "[150]\tvalid_0's f1_weighted: 0.691091\n",
      "[160]\tvalid_0's f1_weighted: 0.691506\n",
      "[170]\tvalid_0's f1_weighted: 0.691481\n",
      "[180]\tvalid_0's f1_weighted: 0.691555\n",
      "[190]\tvalid_0's f1_weighted: 0.691593\n",
      "[200]\tvalid_0's f1_weighted: 0.691533\n",
      "[210]\tvalid_0's f1_weighted: 0.691618\n",
      "[220]\tvalid_0's f1_weighted: 0.691599\n",
      "[230]\tvalid_0's f1_weighted: 0.691542\n",
      "[240]\tvalid_0's f1_weighted: 0.691534\n",
      "[250]\tvalid_0's f1_weighted: 0.69163\n",
      "[260]\tvalid_0's f1_weighted: 0.691558\n",
      "[270]\tvalid_0's f1_weighted: 0.691396\n",
      "[280]\tvalid_0's f1_weighted: 0.691583\n",
      "[290]\tvalid_0's f1_weighted: 0.6917\n",
      "[300]\tvalid_0's f1_weighted: 0.691739\n",
      "[310]\tvalid_0's f1_weighted: 0.691594\n",
      "[320]\tvalid_0's f1_weighted: 0.691849\n",
      "[330]\tvalid_0's f1_weighted: 0.69163\n",
      "[340]\tvalid_0's f1_weighted: 0.691506\n",
      "[350]\tvalid_0's f1_weighted: 0.691725\n",
      "[360]\tvalid_0's f1_weighted: 0.691598\n",
      "[370]\tvalid_0's f1_weighted: 0.691635\n",
      "[380]\tvalid_0's f1_weighted: 0.691497\n",
      "[390]\tvalid_0's f1_weighted: 0.691579\n",
      "[400]\tvalid_0's f1_weighted: 0.691707\n",
      "[410]\tvalid_0's f1_weighted: 0.691848\n",
      "[420]\tvalid_0's f1_weighted: 0.691782\n",
      "[430]\tvalid_0's f1_weighted: 0.691756\n",
      "[440]\tvalid_0's f1_weighted: 0.691754\n",
      "[450]\tvalid_0's f1_weighted: 0.691656\n",
      "[460]\tvalid_0's f1_weighted: 0.691708\n",
      "[470]\tvalid_0's f1_weighted: 0.691883\n",
      "[480]\tvalid_0's f1_weighted: 0.691894\n",
      "[490]\tvalid_0's f1_weighted: 0.691685\n",
      "[500]\tvalid_0's f1_weighted: 0.691856\n",
      "[510]\tvalid_0's f1_weighted: 0.691912\n",
      "[520]\tvalid_0's f1_weighted: 0.691877\n",
      "[530]\tvalid_0's f1_weighted: 0.691867\n",
      "[540]\tvalid_0's f1_weighted: 0.691985\n",
      "[550]\tvalid_0's f1_weighted: 0.691977\n",
      "[560]\tvalid_0's f1_weighted: 0.692179\n",
      "[570]\tvalid_0's f1_weighted: 0.691948\n",
      "[580]\tvalid_0's f1_weighted: 0.69195\n",
      "[590]\tvalid_0's f1_weighted: 0.692037\n",
      "[600]\tvalid_0's f1_weighted: 0.692192\n",
      "[610]\tvalid_0's f1_weighted: 0.692172\n",
      "[620]\tvalid_0's f1_weighted: 0.69201\n",
      "[630]\tvalid_0's f1_weighted: 0.691891\n",
      "[640]\tvalid_0's f1_weighted: 0.691885\n",
      "[650]\tvalid_0's f1_weighted: 0.691888\n",
      "[660]\tvalid_0's f1_weighted: 0.691865\n",
      "[670]\tvalid_0's f1_weighted: 0.691752\n",
      "[680]\tvalid_0's f1_weighted: 0.691668\n",
      "[690]\tvalid_0's f1_weighted: 0.691742\n",
      "[700]\tvalid_0's f1_weighted: 0.691455\n",
      "Early stopping, best iteration is:\n",
      "[600]\tvalid_0's f1_weighted: 0.692192\n",
      "CPU times: user 18h 46min 10s, sys: 8min 45s, total: 18h 54min 56s\n",
      "Wall time: 22min 55s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature  \n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    " \n",
    "print(train_x.shape)\n",
    "print(len(feature), feature)\n",
    "\n",
    "lgb_model4 = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass',class_weight=dict(zip(range(12),[0.85,0.9,0.8,1,1.6,0.4,1.15,0.9,1.2,1.1,1,1.5])), \n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model4.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,verbose=10, early_stopping_rounds=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'data_' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-103-aa6e87f1965f>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m     11\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'importance'\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'importance'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mapply\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0mget_weight\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m     12\u001b[0m \u001b[0mweight\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'importance'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 13\u001b[0;31m \u001b[0mtrain_w\u001b[0m     \u001b[0;34m=\u001b[0m \u001b[0mdata_\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mtrain_index\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mweight\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m: name 'data_' is not defined"
     ]
    }
   ],
   "source": [
    "data['importance'] = (pd.to_datetime(data['click_time']) - pd.to_datetime(data['plan_time'])).dt.seconds\n",
    "def get_weight(x):\n",
    "    if x <= 3:\n",
    "        return 10\n",
    "    elif x <= 10:\n",
    "        return 3\n",
    "    elif x <= 15:\n",
    "        return 2\n",
    "    return 1\n",
    "    \n",
    "data['importance'] = data['importance'].apply(lambda x:get_weight(x))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(436612, 116)\n",
      "116 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff', 'pred_click_time']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.62999\n",
      "[20]\tvalid_0's f1_weighted: 0.686637\n",
      "[30]\tvalid_0's f1_weighted: 0.691877\n",
      "[40]\tvalid_0's f1_weighted: 0.692723\n",
      "[50]\tvalid_0's f1_weighted: 0.69324\n",
      "[60]\tvalid_0's f1_weighted: 0.693462\n",
      "[70]\tvalid_0's f1_weighted: 0.693502\n",
      "[80]\tvalid_0's f1_weighted: 0.693666\n",
      "[90]\tvalid_0's f1_weighted: 0.693946\n",
      "[100]\tvalid_0's f1_weighted: 0.694364\n",
      "[110]\tvalid_0's f1_weighted: 0.694381\n",
      "[120]\tvalid_0's f1_weighted: 0.694302\n",
      "[130]\tvalid_0's f1_weighted: 0.694407\n",
      "[140]\tvalid_0's f1_weighted: 0.694495\n",
      "[150]\tvalid_0's f1_weighted: 0.694594\n",
      "[160]\tvalid_0's f1_weighted: 0.694734\n",
      "[170]\tvalid_0's f1_weighted: 0.694839\n",
      "[180]\tvalid_0's f1_weighted: 0.694897\n",
      "[190]\tvalid_0's f1_weighted: 0.694923\n",
      "[200]\tvalid_0's f1_weighted: 0.694993\n",
      "[210]\tvalid_0's f1_weighted: 0.695144\n",
      "[220]\tvalid_0's f1_weighted: 0.695364\n",
      "[230]\tvalid_0's f1_weighted: 0.695243\n",
      "[240]\tvalid_0's f1_weighted: 0.695322\n",
      "[250]\tvalid_0's f1_weighted: 0.695466\n",
      "[260]\tvalid_0's f1_weighted: 0.695248\n",
      "[270]\tvalid_0's f1_weighted: 0.695433\n",
      "[280]\tvalid_0's f1_weighted: 0.695487\n",
      "[290]\tvalid_0's f1_weighted: 0.695484\n",
      "[300]\tvalid_0's f1_weighted: 0.695395\n",
      "[310]\tvalid_0's f1_weighted: 0.695394\n",
      "[320]\tvalid_0's f1_weighted: 0.695319\n",
      "[330]\tvalid_0's f1_weighted: 0.695383\n",
      "[340]\tvalid_0's f1_weighted: 0.69549\n",
      "[350]\tvalid_0's f1_weighted: 0.695588\n",
      "[360]\tvalid_0's f1_weighted: 0.695674\n",
      "[370]\tvalid_0's f1_weighted: 0.695573\n",
      "[380]\tvalid_0's f1_weighted: 0.695713\n",
      "[390]\tvalid_0's f1_weighted: 0.695571\n",
      "[400]\tvalid_0's f1_weighted: 0.695556\n",
      "[410]\tvalid_0's f1_weighted: 0.695368\n",
      "[420]\tvalid_0's f1_weighted: 0.695531\n",
      "[430]\tvalid_0's f1_weighted: 0.69556\n",
      "[440]\tvalid_0's f1_weighted: 0.695562\n",
      "[450]\tvalid_0's f1_weighted: 0.695571\n",
      "[460]\tvalid_0's f1_weighted: 0.695452\n",
      "Early stopping, best iteration is:\n",
      "[362]\tvalid_0's f1_weighted: 0.695769\n",
      "CPU times: user 10h 37min 3s, sys: 2min 16s, total: 10h 39min 19s\n",
      "Wall time: 11min 17s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature  + ['pred_click_time']\n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "train_w     = data[train_index][weight].values\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    " \n",
    "print(train_x.shape)\n",
    "print(len(feature), feature)\n",
    "\n",
    "# Per-class weights for click modes 0-11, passed to the classifier below.\n",
    "class_weight = dict(zip(range(12), [0.85, 0.9, 0.8, 1, 1.6, 0.4, 1.15, 0.9, 1.2, 1.1, 1, 1.5]))\n",
    "\n",
    "lgb_model4 = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass', class_weight=class_weight,\n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1, min_child_samples=50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\", n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model4.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,sample_weight=train_w,verbose=10, early_stopping_rounds=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {},
   "outputs": [],
   "source": [
    "# pred_val = pd.DataFrame(lgb_model4.predict_proba(valid_x))\n",
    "# pred_val.columns = ['pred_'+str(i) for i in range(12)]\n",
    "\n",
    "# for i in range(1,50):\n",
    "#     a = pred_val[['pred_'+str(i) for i in range(12)]]\n",
    "    \n",
    "#     a['pred_0'] = a['pred_0'] * 1\n",
    "#     a['pred_1'] = a['pred_1'] * 0.8\n",
    "#     a['pred_2'] = a['pred_2'] * 0.7\n",
    "#     a['pred_4'] = a['pred_4'] * 0.6\n",
    "#     a['pred_5'] = a['pred_5'] * 1.9\n",
    "#     a['pred_6'] = a['pred_6'] * 1.25\n",
    "#     a['pred_7'] = a['pred_7'] * 0.9\n",
    "#     a['pred_8'] = a['pred_8'] * 1.2\n",
    "#     a['pred_9'] = a['pred_9'] * 1.1\n",
    "#     a['pred_11'] = a['pred_11'] * (0.1 + i/10)\n",
    "#     pred_label = a.values.argmax(axis=1)\n",
    "#     print((0.1 + i/10),f1_score(y_true=valid_y, y_pred=pred_label, average='weighted'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(436612, 115)\n",
      "115 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.628654\n",
      "[20]\tvalid_0's f1_weighted: 0.675592\n",
      "[30]\tvalid_0's f1_weighted: 0.682258\n",
      "[40]\tvalid_0's f1_weighted: 0.683994\n",
      "[50]\tvalid_0's f1_weighted: 0.68495\n",
      "[60]\tvalid_0's f1_weighted: 0.68593\n",
      "[70]\tvalid_0's f1_weighted: 0.686489\n",
      "[80]\tvalid_0's f1_weighted: 0.686682\n",
      "[90]\tvalid_0's f1_weighted: 0.686995\n",
      "[100]\tvalid_0's f1_weighted: 0.687394\n",
      "[110]\tvalid_0's f1_weighted: 0.687558\n",
      "[120]\tvalid_0's f1_weighted: 0.687651\n",
      "[130]\tvalid_0's f1_weighted: 0.687867\n",
      "[140]\tvalid_0's f1_weighted: 0.687846\n",
      "[150]\tvalid_0's f1_weighted: 0.687987\n",
      "[160]\tvalid_0's f1_weighted: 0.688088\n",
      "[170]\tvalid_0's f1_weighted: 0.687972\n",
      "[180]\tvalid_0's f1_weighted: 0.687947\n",
      "[190]\tvalid_0's f1_weighted: 0.688033\n",
      "[200]\tvalid_0's f1_weighted: 0.688008\n",
      "[210]\tvalid_0's f1_weighted: 0.688011\n",
      "[220]\tvalid_0's f1_weighted: 0.687892\n",
      "[230]\tvalid_0's f1_weighted: 0.688099\n",
      "[240]\tvalid_0's f1_weighted: 0.688145\n",
      "[250]\tvalid_0's f1_weighted: 0.688367\n",
      "[260]\tvalid_0's f1_weighted: 0.688312\n",
      "[270]\tvalid_0's f1_weighted: 0.688344\n",
      "[280]\tvalid_0's f1_weighted: 0.688352\n",
      "[290]\tvalid_0's f1_weighted: 0.688155\n",
      "[300]\tvalid_0's f1_weighted: 0.688249\n",
      "[310]\tvalid_0's f1_weighted: 0.688141\n",
      "[320]\tvalid_0's f1_weighted: 0.688121\n",
      "[330]\tvalid_0's f1_weighted: 0.688103\n",
      "[340]\tvalid_0's f1_weighted: 0.68834\n",
      "[350]\tvalid_0's f1_weighted: 0.688281\n",
      "[360]\tvalid_0's f1_weighted: 0.688097\n",
      "Early stopping, best iteration is:\n",
      "[263]\tvalid_0's f1_weighted: 0.688432\n",
      "CPU times: user 3h 30min 5s, sys: 19 s, total: 3h 30min 24s\n",
      "Wall time: 4min 2s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature  \n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    " \n",
    "print(train_x.shape)\n",
    "print(len(feature), feature)\n",
    "\n",
    "lgb_model2 = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass',\n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model2.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,verbose=10, early_stopping_rounds=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style>\n",
       "    .dataframe thead tr:only-child th {\n",
       "        text-align: right;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>fea</th>\n",
       "      <th>imp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>115</th>\n",
       "      <td>pred_click_time</td>\n",
       "      <td>14674</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>86</th>\n",
       "      <td>std_dist</td>\n",
       "      <td>10252</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>94</th>\n",
       "      <td>std_eta</td>\n",
       "      <td>9901</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>o_lng</td>\n",
       "      <td>8862</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>o_lat</td>\n",
       "      <td>8510</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>d_lng</td>\n",
       "      <td>8430</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>92</th>\n",
       "      <td>min_eta</td>\n",
       "      <td>8258</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>70</th>\n",
       "      <td>pid</td>\n",
       "      <td>7758</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d_lat</td>\n",
       "      <td>7578</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>91</th>\n",
       "      <td>max_eta</td>\n",
       "      <td>7139</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>112</th>\n",
       "      <td>req_time_hour</td>\n",
       "      <td>6959</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>93</th>\n",
       "      <td>mean_eta</td>\n",
       "      <td>6385</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>84</th>\n",
       "      <td>min_dist</td>\n",
       "      <td>6021</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>104</th>\n",
       "      <td>svd_mode_2</td>\n",
       "      <td>5857</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>83</th>\n",
       "      <td>max_dist</td>\n",
       "      <td>5591</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>mean_price</td>\n",
       "      <td>5495</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90</th>\n",
       "      <td>std_price</td>\n",
       "      <td>5237</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>111</th>\n",
       "      <td>svd_mode_9</td>\n",
       "      <td>4843</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>102</th>\n",
       "      <td>svd_mode_0</td>\n",
       "      <td>4752</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>mean_dist</td>\n",
       "      <td>4737</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>103</th>\n",
       "      <td>svd_mode_1</td>\n",
       "      <td>4535</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>107</th>\n",
       "      <td>svd_mode_5</td>\n",
       "      <td>4519</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>105</th>\n",
       "      <td>svd_mode_3</td>\n",
       "      <td>4390</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>110</th>\n",
       "      <td>svd_mode_8</td>\n",
       "      <td>4386</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>106</th>\n",
       "      <td>svd_mode_4</td>\n",
       "      <td>4338</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>109</th>\n",
       "      <td>svd_mode_7</td>\n",
       "      <td>4233</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>87</th>\n",
       "      <td>max_price</td>\n",
       "      <td>4138</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>113</th>\n",
       "      <td>req_time_weekday</td>\n",
       "      <td>3971</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>108</th>\n",
       "      <td>svd_mode_6</td>\n",
       "      <td>3900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>101</th>\n",
       "      <td>first_mode</td>\n",
       "      <td>3314</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>52</th>\n",
       "      <td>p48</td>\n",
       "      <td>202</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>p17</td>\n",
       "      <td>194</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>p25</td>\n",
       "      <td>190</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>p53</td>\n",
       "      <td>189</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55</th>\n",
       "      <td>p51</td>\n",
       "      <td>151</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63</th>\n",
       "      <td>p59</td>\n",
       "      <td>149</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>p1</td>\n",
       "      <td>139</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>49</th>\n",
       "      <td>p45</td>\n",
       "      <td>135</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48</th>\n",
       "      <td>p44</td>\n",
       "      <td>133</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>59</th>\n",
       "      <td>p55</td>\n",
       "      <td>131</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>p18</td>\n",
       "      <td>121</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47</th>\n",
       "      <td>p43</td>\n",
       "      <td>117</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>p15</td>\n",
       "      <td>110</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>56</th>\n",
       "      <td>p52</td>\n",
       "      <td>109</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>71</th>\n",
       "      <td>mode_feas_0</td>\n",
       "      <td>108</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>p11</td>\n",
       "      <td>105</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>114</th>\n",
       "      <td>time_diff</td>\n",
       "      <td>98</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>p22</td>\n",
       "      <td>97</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>68</th>\n",
       "      <td>p64</td>\n",
       "      <td>89</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>p6</td>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>p16</td>\n",
       "      <td>79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>69</th>\n",
       "      <td>p65</td>\n",
       "      <td>77</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>p23</td>\n",
       "      <td>71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62</th>\n",
       "      <td>p58</td>\n",
       "      <td>64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>p19</td>\n",
       "      <td>57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>p20</td>\n",
       "      <td>42</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>p12</td>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45</th>\n",
       "      <td>p41</td>\n",
       "      <td>38</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>p14</td>\n",
       "      <td>37</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>p24</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>116 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  fea    imp\n",
       "115   pred_click_time  14674\n",
       "86           std_dist  10252\n",
       "94            std_eta   9901\n",
       "0               o_lng   8862\n",
       "1               o_lat   8510\n",
       "2               d_lng   8430\n",
       "92            min_eta   8258\n",
       "70                pid   7758\n",
       "3               d_lat   7578\n",
       "91            max_eta   7139\n",
       "112     req_time_hour   6959\n",
       "93           mean_eta   6385\n",
       "84           min_dist   6021\n",
       "104        svd_mode_2   5857\n",
       "83           max_dist   5591\n",
       "89         mean_price   5495\n",
       "90          std_price   5237\n",
       "111        svd_mode_9   4843\n",
       "102        svd_mode_0   4752\n",
       "85          mean_dist   4737\n",
       "103        svd_mode_1   4535\n",
       "107        svd_mode_5   4519\n",
       "105        svd_mode_3   4390\n",
       "110        svd_mode_8   4386\n",
       "106        svd_mode_4   4338\n",
       "109        svd_mode_7   4233\n",
       "87          max_price   4138\n",
       "113  req_time_weekday   3971\n",
       "108        svd_mode_6   3900\n",
       "101        first_mode   3314\n",
       "..                ...    ...\n",
       "52                p48    202\n",
       "21                p17    194\n",
       "29                p25    190\n",
       "57                p53    189\n",
       "55                p51    151\n",
       "63                p59    149\n",
       "5                  p1    139\n",
       "49                p45    135\n",
       "48                p44    133\n",
       "59                p55    131\n",
       "22                p18    121\n",
       "47                p43    117\n",
       "19                p15    110\n",
       "56                p52    109\n",
       "71        mode_feas_0    108\n",
       "15                p11    105\n",
       "114         time_diff     98\n",
       "26                p22     97\n",
       "68                p64     89\n",
       "10                 p6     83\n",
       "20                p16     79\n",
       "69                p65     77\n",
       "27                p23     71\n",
       "62                p58     64\n",
       "23                p19     57\n",
       "24                p20     42\n",
       "16                p12     38\n",
       "45                p41     38\n",
       "18                p14     37\n",
       "28                p24      1\n",
       "\n",
       "[116 rows x 2 columns]"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "imp = pd.DataFrame()\n",
    "imp['fea'] = feature\n",
    "imp['imp'] = lgb_model1.feature_importances_ \n",
    "imp = imp.sort_values('imp',ascending = False)\n",
    "imp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "142 ['o_lng', 'o_lat', 'd_lng', 'd_lat', 'p0', 'p1', 'p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'p28', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'pid', 'mode_feas_0', 'mode_feas_1', 'mode_feas_2', 'mode_feas_3', 'mode_feas_4', 'mode_feas_5', 'mode_feas_6', 'mode_feas_7', 'mode_feas_8', 'mode_feas_9', 'mode_feas_10', 'mode_feas_11', 'max_dist', 'min_dist', 'mean_dist', 'std_dist', 'max_price', 'min_price', 'mean_price', 'std_price', 'max_eta', 'min_eta', 'mean_eta', 'std_eta', 'max_dist_mode', 'min_dist_mode', 'max_price_mode', 'min_price_mode', 'max_eta_mode', 'min_eta_mode', 'first_mode', 'svd_mode_0', 'svd_mode_1', 'svd_mode_2', 'svd_mode_3', 'svd_mode_4', 'svd_mode_5', 'svd_mode_6', 'svd_mode_7', 'svd_mode_8', 'svd_mode_9', 'req_time_hour', 'req_time_weekday', 'time_diff', 'price_0', 'eta_0', 'distance_0', 'price_1', 'eta_1', 'distance_1', 'rank_1', 'price_2', 'eta_2', 'distance_2', 'rank_2', 'price_3', 'eta_3', 'distance_3', 'rank_3', 'price_4', 'eta_4', 'distance_4', 'rank_4', 'price_5', 'eta_5', 'distance_5', 'rank_5', 'price_6', 'eta_6', 'distance_6', 'rank_6']\n",
      "Training until validation scores don't improve for 100 rounds.\n",
      "[10]\tvalid_0's f1_weighted: 0.629979\n",
      "[20]\tvalid_0's f1_weighted: 0.676169\n",
      "[30]\tvalid_0's f1_weighted: 0.6821\n",
      "[40]\tvalid_0's f1_weighted: 0.683587\n",
      "[50]\tvalid_0's f1_weighted: 0.684686\n",
      "[60]\tvalid_0's f1_weighted: 0.685643\n",
      "[70]\tvalid_0's f1_weighted: 0.686002\n",
      "[80]\tvalid_0's f1_weighted: 0.686622\n",
      "[90]\tvalid_0's f1_weighted: 0.687029\n",
      "[100]\tvalid_0's f1_weighted: 0.687314\n",
      "[110]\tvalid_0's f1_weighted: 0.687505\n",
      "[120]\tvalid_0's f1_weighted: 0.687876\n",
      "[130]\tvalid_0's f1_weighted: 0.68772\n",
      "[140]\tvalid_0's f1_weighted: 0.688023\n",
      "[150]\tvalid_0's f1_weighted: 0.688101\n",
      "[160]\tvalid_0's f1_weighted: 0.688142\n",
      "[170]\tvalid_0's f1_weighted: 0.688176\n",
      "[180]\tvalid_0's f1_weighted: 0.687953\n",
      "[190]\tvalid_0's f1_weighted: 0.687922\n",
      "[200]\tvalid_0's f1_weighted: 0.687888\n",
      "[210]\tvalid_0's f1_weighted: 0.688179\n",
      "[220]\tvalid_0's f1_weighted: 0.688165\n",
      "[230]\tvalid_0's f1_weighted: 0.688111\n",
      "[240]\tvalid_0's f1_weighted: 0.688236\n",
      "[250]\tvalid_0's f1_weighted: 0.688367\n",
      "[260]\tvalid_0's f1_weighted: 0.688116\n",
      "[270]\tvalid_0's f1_weighted: 0.688186\n",
      "[280]\tvalid_0's f1_weighted: 0.688353\n",
      "[290]\tvalid_0's f1_weighted: 0.688297\n",
      "[300]\tvalid_0's f1_weighted: 0.688018\n",
      "[310]\tvalid_0's f1_weighted: 0.688374\n",
      "[320]\tvalid_0's f1_weighted: 0.688418\n",
      "[330]\tvalid_0's f1_weighted: 0.688353\n",
      "[340]\tvalid_0's f1_weighted: 0.688385\n",
      "[350]\tvalid_0's f1_weighted: 0.688452\n",
      "[360]\tvalid_0's f1_weighted: 0.688541\n",
      "[370]\tvalid_0's f1_weighted: 0.688461\n",
      "[380]\tvalid_0's f1_weighted: 0.688351\n",
      "[390]\tvalid_0's f1_weighted: 0.68838\n",
      "[400]\tvalid_0's f1_weighted: 0.688178\n",
      "[410]\tvalid_0's f1_weighted: 0.688121\n",
      "[420]\tvalid_0's f1_weighted: 0.688067\n",
      "[430]\tvalid_0's f1_weighted: 0.688153\n",
      "[440]\tvalid_0's f1_weighted: 0.688162\n",
      "[450]\tvalid_0's f1_weighted: 0.688305\n",
      "Early stopping, best iteration is:\n",
      "[357]\tvalid_0's f1_weighted: 0.688633\n",
      "CPU times: user 3h 46min 7s, sys: 18.3 s, total: 3h 46min 26s\n",
      "Wall time: 4min 21s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "profile_feature    = ['p' + str(i) for i in range(66)]\n",
    "origin_num_feature = ['o_lng', 'o_lat', 'd_lng', 'd_lat'] + profile_feature\n",
    "cate_feature       = ['pid']  \n",
    "feature            = origin_num_feature + cate_feature + plan_features + time_feature + rank_feature\n",
    " \n",
    "train_index = (data.req_time < '2018-11-23')\n",
    "train_x     = data[train_index][feature].reset_index(drop=True)\n",
    "train_y     = data[train_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "valid_index = (data.req_time > '2018-11-23') & (data.req_time < '2018-12-01')\n",
    "valid_x     = data[valid_index][feature].reset_index(drop=True)\n",
    "valid_y     = data[valid_index].click_mode.reset_index(drop=True)\n",
    "\n",
    "test_index = (data.req_time > '2018-12-01')\n",
    "test_x     = data[test_index][feature].reset_index(drop=True)\n",
    "\n",
    "print(len(feature), feature)\n",
    "\n",
    "lgb_model = lgb.LGBMClassifier(boosting_type=\"gbdt\", num_leaves=61, reg_alpha=0, reg_lambda=0.01,\n",
    "    max_depth=-1, n_estimators=2000, objective='multiclass',\n",
    "    subsample=0.8, colsample_bytree=0.8, subsample_freq=1,min_child_samples = 50,\n",
    "    learning_rate=0.05, random_state=2019, metric=\"None\",n_jobs=-1)\n",
    "\n",
    "eval_set = [(valid_x, valid_y)]\n",
    "lgb_model.fit(train_x, train_y, eval_set=eval_set, eval_metric=f1_weighted,verbose=10, early_stopping_rounds=100)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
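    "## Feature importance analysis\n",
    "\n",
    "From the model's feature importances we observe:\n",
    "- `pid` is the most important feature. This is not surprising: in this competition `pid` acts as a kind of clustering feature that identifies a group of users. For example, if some `pid` values correspond to wealthy users who own cars, those users will mostly choose to drive.\n",
    "\n",
    "- The spread of ETA and of distance across the recommended plans (`std_eta`, `std_dist`) is also extremely important. This is easy to explain: the standard deviation summarizes the distribution of the plans, and a large value means the available transport modes differ greatly. For example, if walking from A to B takes 2 h while the subway takes only 10 min, the vast majority of people will clearly take the subway rather than walk.\n",
    "\n",
    "- `req_time_hour` is also a very important feature: people choose different transport modes at different times of day, so this is understandable as well."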
   ]
  },
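  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of why the spread features `std_eta` and `std_dist` carry signal, the sketch below computes them on a toy table. The `plans` frame, its column names (`sid`, `eta`, `distance`) and all values are invented for this example; it is not the notebook's actual feature code.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data: candidate plans for two requests (sid). Values are invented.\n",
    "plans = pd.DataFrame({\n",
    "    'sid':      [1, 1, 1, 2, 2],\n",
    "    'eta':      [7200, 600, 900, 1500, 1560],       # seconds\n",
    "    'distance': [8000, 9500, 9000, 12000, 12500],   # metres\n",
    "})\n",
    "\n",
    "# Standard deviation of ETA and distance across the plans of each request.\n",
    "spread = plans.groupby('sid')[['eta', 'distance']].std()\n",
    "spread.columns = ['std_eta', 'std_dist']\n",
    "print(spread)\n",
    "```\n",
    "\n",
    "Request 1 mixes one very slow plan with two fast ones, so its `std_eta` is far larger than that of request 2, whose plans are nearly interchangeable. A tree model can split on this value to separate requests with one clearly dominant option from requests where the choice is close."
   ]
  },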
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>fea</th>\n",
       "      <th>imp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>70</th>\n",
       "      <td>pid</td>\n",
       "      <td>14065</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>94</th>\n",
       "      <td>std_eta</td>\n",
       "      <td>2969</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>o_lat</td>\n",
       "      <td>2300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>101</th>\n",
       "      <td>first_mode</td>\n",
       "      <td>2270</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>o_lng</td>\n",
       "      <td>2261</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>d_lng</td>\n",
       "      <td>2145</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>86</th>\n",
       "      <td>std_dist</td>\n",
       "      <td>2125</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>112</th>\n",
       "      <td>req_time_hour</td>\n",
       "      <td>2082</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>104</th>\n",
       "      <td>svd_mode_2</td>\n",
       "      <td>2032</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>84</th>\n",
       "      <td>min_dist</td>\n",
       "      <td>1998</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>d_lat</td>\n",
       "      <td>1997</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>92</th>\n",
       "      <td>min_eta</td>\n",
       "      <td>1909</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>91</th>\n",
       "      <td>max_eta</td>\n",
       "      <td>1646</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>111</th>\n",
       "      <td>svd_mode_9</td>\n",
       "      <td>1607</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>102</th>\n",
       "      <td>svd_mode_0</td>\n",
       "      <td>1582</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>93</th>\n",
       "      <td>mean_eta</td>\n",
       "      <td>1510</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>103</th>\n",
       "      <td>svd_mode_1</td>\n",
       "      <td>1506</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>105</th>\n",
       "      <td>svd_mode_3</td>\n",
       "      <td>1499</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>107</th>\n",
       "      <td>svd_mode_5</td>\n",
       "      <td>1403</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>106</th>\n",
       "      <td>svd_mode_4</td>\n",
       "      <td>1297</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>110</th>\n",
       "      <td>svd_mode_8</td>\n",
       "      <td>1263</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>mean_price</td>\n",
       "      <td>1257</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>108</th>\n",
       "      <td>svd_mode_6</td>\n",
       "      <td>1213</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>109</th>\n",
       "      <td>svd_mode_7</td>\n",
       "      <td>1163</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90</th>\n",
       "      <td>std_price</td>\n",
       "      <td>1118</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>83</th>\n",
       "      <td>max_dist</td>\n",
       "      <td>1095</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>mean_dist</td>\n",
       "      <td>1045</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>87</th>\n",
       "      <td>max_price</td>\n",
       "      <td>891</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>113</th>\n",
       "      <td>req_time_weekday</td>\n",
       "      <td>830</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>p0</td>\n",
       "      <td>753</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>54</th>\n",
       "      <td>p50</td>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>p18</td>\n",
       "      <td>28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>60</th>\n",
       "      <td>p56</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>68</th>\n",
       "      <td>p64</td>\n",
       "      <td>25</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>59</th>\n",
       "      <td>p55</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>p17</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48</th>\n",
       "      <td>p44</td>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>63</th>\n",
       "      <td>p59</td>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55</th>\n",
       "      <td>p51</td>\n",
       "      <td>19</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47</th>\n",
       "      <td>p43</td>\n",
       "      <td>17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>114</th>\n",
       "      <td>time_diff</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>p6</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>p1</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>p15</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26</th>\n",
       "      <td>p22</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>49</th>\n",
       "      <td>p45</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>29</th>\n",
       "      <td>p25</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>p53</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>p16</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>p19</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>56</th>\n",
       "      <td>p52</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>p20</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45</th>\n",
       "      <td>p41</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>69</th>\n",
       "      <td>p65</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62</th>\n",
       "      <td>p58</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>p23</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>p14</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>p11</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>p12</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28</th>\n",
       "      <td>p24</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>115 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  fea    imp\n",
       "70                pid  14065\n",
       "94            std_eta   2969\n",
       "1               o_lat   2300\n",
       "101        first_mode   2270\n",
       "0               o_lng   2261\n",
       "2               d_lng   2145\n",
       "86           std_dist   2125\n",
       "112     req_time_hour   2082\n",
       "104        svd_mode_2   2032\n",
       "84           min_dist   1998\n",
       "3               d_lat   1997\n",
       "92            min_eta   1909\n",
       "91            max_eta   1646\n",
       "111        svd_mode_9   1607\n",
       "102        svd_mode_0   1582\n",
       "93           mean_eta   1510\n",
       "103        svd_mode_1   1506\n",
       "105        svd_mode_3   1499\n",
       "107        svd_mode_5   1403\n",
       "106        svd_mode_4   1297\n",
       "110        svd_mode_8   1263\n",
       "89         mean_price   1257\n",
       "108        svd_mode_6   1213\n",
       "109        svd_mode_7   1163\n",
       "90          std_price   1118\n",
       "83           max_dist   1095\n",
       "85          mean_dist   1045\n",
       "87          max_price    891\n",
       "113  req_time_weekday    830\n",
       "4                  p0    753\n",
       "..                ...    ...\n",
       "54                p50     29\n",
       "22                p18     28\n",
       "60                p56     27\n",
       "68                p64     25\n",
       "59                p55     22\n",
       "21                p17     22\n",
       "48                p44     21\n",
       "63                p59     20\n",
       "55                p51     19\n",
       "47                p43     17\n",
       "114         time_diff     14\n",
       "10                 p6     14\n",
       "5                  p1     13\n",
       "19                p15     13\n",
       "26                p22     13\n",
       "49                p45     12\n",
       "29                p25     11\n",
       "57                p53      9\n",
       "20                p16      9\n",
       "23                p19      8\n",
       "56                p52      8\n",
       "24                p20      4\n",
       "45                p41      4\n",
       "69                p65      3\n",
       "62                p58      2\n",
       "27                p23      1\n",
       "18                p14      1\n",
       "15                p11      1\n",
       "16                p12      0\n",
       "28                p24      0\n",
       "\n",
       "[115 rows x 2 columns]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
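   ],
   "source": [
    "# Pair each feature name with its importance from the trained LightGBM model\n",
    "imp = pd.DataFrame()\n",
    "imp['fea'] = feature\n",
    "imp['imp'] = lgb_model.feature_importances_\n",
    "# Sort from most to least important and display the table\n",
    "imp = imp.sort_values('imp', ascending=False)\n",
    "imp"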
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f1aece7ecc0>"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABKMAAAJRCAYAAACZcHQ5AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAAIABJREFUeJzs3X245nVdJ/D3RyZrqzUwxiceGsrZjOzJJrWt3VzZeNIYlCExk8EwEvFhc62g3Y1N81p7kiATQiDATKDhUUWRi7SHDZFREEVUZtGFEZQp0G3XKwv77h/nN3VzuO97zgznfO8zc16v6zrXue/v9/P7fb+/3/1wzvW+fg/VWgsAAAAA9PCYWU8AAAAAgJVDGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6GbVrCfQ27777tvWrFkz62kAAAAA7DE++tGP/k1rbfVCaldcGLVmzZps3rx51tMAAAAA2GNU1f9eaK3T9AAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgmyULo6rqgqq6v6o+Oabv9VXVqmrf4XlV1VlVtaWqbquqZ4zUbqyqO4efjSPtP1xVnxiWOauqaqm2BQAAAIDFsWoJ131hkrcmuXi0saoOSPKTSe4eaT4iydrh51lJzk7yrKp6fJLTk6xL0pJ8tKquaa09ONSclOTDSa5NcniS9y1kYtvO/uOJfatP/tmFrAIAAACAXbBkR0a11v4iyQNjus5I8suZC5e2W5/k4jbnw0n2rqonJzksyfWttQeGAOr6JIcPfY9rrd3YWmuZC7yOXqptAQAAAGBxdL1mVFUdleQLrbWPz+vaL8k9I8+3Dm3T2reOaQcAAABgGVvK0/Qepqq+Ocl/SXLouO4xbW0X2ieNfVLmTunLgQceuMO5AgAAALA0eh4Z9V1JDkry8ar6fJL9k3ysqp6UuSObDhip3T/JvTto339M+1ittXNba+taa+tWr169CJsCAAAAwK7oFka11j7RWntCa21Na21N5gKlZ7TWvpjkmiTHD3fVe3aSr7TW7ktyXZJDq2qfqtonc0dVXTf0/V1VPXu4i97xSa7utS0AAAAA7JolC6Oq6l1Jbkzy3VW1tapOnFJ+bZK7kmxJ8vYkr0yS1toDSd6Y5Obh5w1DW5KcnOS8YZn/lQXeSQ8AAACA2Vmya0a11l68g/41I49bklMm1F2Q5IIx7ZuTPP3RzRIAAACAnrreTQ8AAACAlU0YBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCN
MAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhmycKoqrqgqu6vqk+OtP12VX26qm6rqiurau+RvtOqaktVfaaqDhtpP3xo21JVp460H1RVN1XVnVV1aVU9dqm2BQAAAIDFsZRHRl2Y5PB5bdcneXpr7fuTfDbJaUlSVQcnOS7J9w7LvK2q9qqqvZL8QZIjkhyc5MVDbZL8ZpIzWmtrkzyY5MQl3BYAAAAAFsGShVGttb9I8sC8tg+01h4ann44yf7D4/VJLmmtfa219rkkW5I8c/jZ0lq7q7X2D0kuSbK+qirJc5NsGpa/KMnRS7UtAAAAACyOWV4z6ueSvG94vF+Se0b6tg5tk9q/PcmXR4Kt7e1jVdVJVbW5qjZv27ZtkaYPAAAAwM6aSRhVVf8lyUNJ3rm9aUxZ24X2sVpr57bW1rXW1q1evXpnpwsAAADAIlnVe8Cq2pjk+UkOaa1tD5C2JjlgpGz/JPcOj8e1/02Svatq1XB01Gg9AAAAAMtU1yOjqurwJL+S5KjW2ldHuq5JclxVfWNVHZRkbZKPJLk5ydrhznmPzdxFzq8ZQqwPJtkwLL8xydW9tgMAAACAXbNkYVRVvSvJjUm+u6q2VtWJSd6a5F8nub6qbq2qc5KktXZ7ksuSfCrJ+5Oc0lr7+nDU06uSXJfkjiSXDbXJXKj1uqrakrlrSJ2/VNsCAAAAwOJYstP0WmsvHtM8MTBqrb0pyZvGtF+b5Nox7Xdl7m57AAAAAOwmZnk3PQAAAABWGGEUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0s2RhVFVdUFX3V9UnR9oeX1XXV9Wdw+99hvaqqrOqaktV3VZVzxhZZuNQf2dVbRxp/+Gq+sSwzFlVVUu1LQAAAAAsjqU8MurCJIfPazs1yQ2ttbVJbhieJ8kRSdYOPyclOTuZC6+SnJ7kWUmemeT07QHWUHPSyHLzxwIAAABgmVmyMKq19hdJHpjXvD7JRcPji5IcPdJ+cZvz4SR7V9WTkxyW5PrW2gOttQeTXJ/k8KHvca21G1trLcnFI+sCAAAAYJnqfc2oJ7bW7kuS4fcThvb9ktwzUrd1aJvWvnVMOwAAAADL2HK5gPm46z21XWgfv/Kqk6pqc1Vt3rZt2y5OEQAAAIBHq3cY9aXhFLsMv+8f2rcmOWCkbv8k9+6gff8x7WO11s5tra1rra1bvXr1o94IAAAAAHZN7zDqmiTb74i3McnVI+3HD3fVe3aSrwyn8V2X5NCq2me4cPmhSa4b+v6uqp493EXv+JF1AQAAALBMrVqqFVfVu5I8J8m+VbU1c3fFe3OSy6rq
xCR3Jzl2KL82yZFJtiT5apKXJUlr7YGqemOSm4e6N7TWtl8U/eTM3bHvXyV53/ADAAAAwDK2ZGFUa+3FE7oOGVPbkpwyYT0XJLlgTPvmJE9/NHMEAAAAoK/lcgFzAAAAAFYAYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgm5mEUVX1i1V1e1V9sqreVVXfVFUHVdVNVXVnVV1aVY8dar9xeL5l6F8zsp7ThvbPVNVhs9gWAAAAABauexhVVfsleU2Sda21pyfZK8lxSX4zyRmttbVJHkxy4rDIiUkebK09NckZQ12q6uBhue9NcniSt1XVXj23BQAAAICdM6vT9FYl+VdVtSrJNye5L8lzk2wa+i9KcvTweP3wPEP/IVVVQ/slrbWvtdY+l2RLkmd2mj8AAAAAu6B7GNVa+0KS30lyd+ZCqK8k+WiSL7fWHhrKtibZb3i8X5J7hmUfGuq/fbR9zDIPU1UnVdXmqtq8bdu2xd0gAAAAABZsFqfp7ZO5o5oOSvKUJN+S5IgxpW37IhP6JrU/srG1c1tr61pr61avXr3zkwYAAABgUcziNL3/mORzrbVtrbV/THJFkn+bZO/htL0k2T/JvcPjrUkOSJKh/9uSPDDaPmYZAAAAAJahWYRRdyd5dlV983Dtp0OSfCrJB5NsGGo2Jrl6eHzN8DxD/5+11trQftxwt72DkqxN8pFO2wAAAADALli145LF1Vq7qao2JflYkoeS3JLk3CTvTXJJVf3G0Hb+sMj5Sd5RVVsyd0TUccN6bq+qyzIXZD2U5JTW2te7bgwAAAAAO6V7GJUkrbXTk5w+r/mujLkbXmvt75McO2E9b0rypkWfIAAAAABLYhan6QEAAACwQgmjAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6GbVQgur6rFJnpakJflMa+0flmxWAAAAAOyRFhRGVdXzkpyT5H8lqSQHVdUvtNbet5STAwAAAGDPstAjo343yX9orW1Jkqr6riTvTSKMAgAAAGDBFnrNqPu3B1GDu5LcvwTzAQAAAGAPttAjo26vqmuTXJa5a0Ydm+TmqnphkrTWrlii+QEAAACwB1loGPVNSb6U5CeG59uSPD7JT2UunBJGAQAAALBDCwqjWmsvW+qJAAAAALDnW+jd9A5K8uoka0aXaa0dtTTTAgAAAGBPtNDT9K5Kcn6Sdyf5p6WbDgAAAAB7soWGUX/fWjtrSWcCAAAAwB5voWHUmVV1epIPJPna9sbW2seWZFYAAAAA7JEWGkZ9X5KXJnlu/uU0vTY8BwAAAIAFWWgY9YIk39la+4elnAwAAAAAe7bHLLDu40n2XsqJAAAAALDnW+iRUU9M
8umqujkPv2bUUUsyKwAAAAD2SAsNo05f0lkAAAAAsCIsKIxqrf35Uk8EAAAAgD3f1DCqqv6qtfbjVfV3mbt73j93JWmttcct6ewAAAAA2KNMDaNaaz8+/P7XfaYDAAAAwJ5soXfTAwAAAIBHTRgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALqZSRhVVXtX1aaq+nRV3VFVP1pVj6+q66vqzuH3PkNtVdVZVbWlqm6rqmeMrGfjUH9nVW2cxbYAAAAAsHCzOjLqzCTvb609LckPJLkjyalJbmitrU1yw/A8SY5Isnb4OSnJ2UlSVY9PcnqSZyV5ZpLTtwdYAAAAACxP3cOoqnpckn+f5Pwkaa39Q2vty0nWJ7loKLsoydHD4/VJLm5zPpxk76p6cpLDklzfWnugtfZgkuuTHN5xUwAAAADYSbM4Muo7k2xL8kdVdUtVnVdV35Lkia21+5Jk+P2EoX6/JPeMLL91aJvUDgAAAMAyNYswalWSZyQ5u7X2Q0n+X/7llLxxakxbm9L+yBVUnVRVm6tq87Zt23Z2vgAAAAAsklmEUVuTbG2t3TQ835S5cOpLw+l3GX7fP1J/wMjy+ye5d0r7I7TWzm2trWutrVu9evWibQgAAAAAO6d7GNVa+2KSe6rqu4emQ5J8Ksk1SbbfEW9jkquHx9ckOX64q96zk3xlOI3vuiSHVtU+w4XLDx3aAAAAAFimVs1o3FcneWdVPTbJXUlelrlg7LKqOjHJ3UmOHWqvTXJkki1JvjrUprX2QFW9McnNQ90bWmsP9NsEAAAAAHbWTMKo1tqtSdaN6TpkTG1LcsqE9VyQ5ILFnR0AAAAAS2UW14wCAAAAYIUSRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6mVkYVVV7VdUtVfWe4flBVXVTVd1ZVZdW1WOH9m8cnm8Z+teMrOO0of0zVXXYbLYEAAAAgIWa5ZFRr01yx8jz30xyRmttbZIHk5w4tJ+Y5MHW2lOTnDHUpaoOTnJcku9NcniSt1XVXp3mDgAAAMAumEkYVVX7J3lekvOG55XkuUk2DSUXJTl6eLx+eJ6h/5Chfn2SS1prX2utfS7JliTP7LMF
AAAAAOyKWR0Z9XtJfjnJPw3Pvz3Jl1trDw3PtybZb3i8X5J7kmTo/8pQ/8/tY5YBAAAAYBnqHkZV1fOT3N9a++ho85jStoO+acvMH/OkqtpcVZu3bdu2U/MFAAAAYPHM4sioH0tyVFV9PsklmTs97/eS7F1Vq4aa/ZPcOzzemuSAJBn6vy3JA6PtY5Z5mNbaua21da21datXr17crQEAAABgwbqHUa2101pr+7fW1mTuAuR/1lp7SZIPJtkwlG1McvXw+JrheYb+P2uttaH9uOFuewclWZvkI502AwAAAIBdsGrHJd38SpJLquo3ktyS5Pyh/fwk76iqLZk7Iuq4JGmt3V5VlyX5VJKHkpzSWvt6/2kDAAAAsFAzDaNaax9K8qHh8V0Zcze81trfJzl2wvJvSvKmpZshAAAAAItpVnfTAwAAAGAFEkYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgG2EUAAAAAN0IowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALpZNesJLFfbzr5wav/qk0/oMg8AAACAPYkjowAAAADoRhgFAAAAQDfCKAAAAAC6EUYBAAAA0I0wCgAAAIBuhFEAAAAAdCOMAgAAAKAbYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEAAADQjTAKAAAAgG6EUQAAAAB0I4wCAAAAoBthFAAAAADdCKMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHQjjAIAAACgm1WznsDubts5503tX/2Kl3eaCQAAAMDyJ4zqZNs5Z0/tX/2KkzvNBAAAAGB2hFHLyP3nnDm1/wmveG2nmQAAAAAsDdeMAgAAAKAbYRQAAAAA3QijAAAAAOjGNaN2Q186+7em9j/x5F/uNBMAAACAnSOM2oN98ew3TO1/0sm/1mkmAAAAAHO6h1FVdUCSi5M8Kck/JTm3tXZmVT0+yaVJ1iT5fJKfbq09WFWV5MwkRyb5apITWmsfG9a1Mcl/HVb9G621i3puy57ivredNrHvya/8Hx1nAgAAAOzpZnHNqIeS/OfW2vckeXaSU6rq4CSnJrmhtbY2yQ3D8yQ5Isna4eekJGcnyRBenZ7kWUmemeT0qtqn54YAAAAAsHO6HxnVWrsvyX3D47+rqjuS7JdkfZLnDGUXJflQkl8Z2i9urbUkH66qvavqyUPt9a21B5Kkqq5PcniSd3XbmBXkC3/wmqn9+51yVqeZAAAAALuzmV4zqqrWJPmhJDcleeIQVKW1dl9VPWEo2y/JPSOLbR3aJrWPG+ekzB1VlQMPPHDxNoBHuOf3N07tP+DVc2dSfv6soyfWrHnNVYs6JwAAAGD5mMVpekmSqvrWJJcn+U+ttf8zrXRMW5vS/sjG1s5tra1rra1bvXr1zk8WAAAAgEUxkzCqqr4hc0HUO1trVwzNXxpOv8vw+/6hfWuSA0YW3z/JvVPaAQAAAFimuodRw93xzk9yR2vtLSNd1yTZfo7XxiRXj7QfX3OeneQrw+l81yU5tKr2GS5cfujQBgAAAMAyNYtrRv1Ykpcm+URV3Tq0/WqSNye5rKpOTHJ3kmOHvmuTHJlkS5KvJnlZkrTWHqiqNya5eah7w/aLmbNn+Oxb10/t/zevunpqPwAAALD8zOJuen+V8dd7SpJDxtS3JKdMWNcFSS5YvNkBAAAAsJRmdgFzAAAAAFYeYRQAAAAA3QijAAAAAOhGGAUAAABAN8IoAAAAALoRRgEA
AADQjTAKAAAAgG5WzXoC8Gh96m1HTe0/+JXXdJoJAAAAsCOOjAIAAACgG0dGsWJ8/OzpR1D9wMmOoAIAAIClJoyCeT56zk9N7PvhV7y740wAAABgz+M0PQAAAAC6cWQU7IKb/vD5U/uf9QvvSZL8z3On1/3YSe9ZtDkBAADA7kAYBcvAn7/9eRP7fuLn39txJgAAALC0nKYHAAAAQDfCKAAAAAC6cZoe7Cb+7LzJp/IlyXNf7nQ+AAAAlj9HRgEAAADQjTAKAAAAgG6EUQAAAAB045pRsIf5wPlHTu0/9MRrO80EAAAAHkkYBSvU+3YQWh0htAIAAGAJCKOAid59wRFT+3/q597XaSYAAADsKYRRwKN21Q5Cq6OFVgAAAAxcwBwAAACAbhwZBXSz6Y8On9q/4WXv7zQTAAAAZsWRUQAAAAB048goYNm5dMoRVC9y9BQAAMBuTRgF7JbeeeFhU/tfcsJ1nWYCAADAzhBGAXu0i3cQWh0vtAIAAOjKNaMAAAAA6EYYBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAbtxNDyDJBRcdOrHv5zZ+oONMAAAA9myOjAIAAACgG0dGASzQ2y8+bGr/zx9/XaeZAAAA7L4cGQUAAABAN8IoAAAAALpxmh7AIjvnHdNP53vFS53OBwAArFyOjAIAAACgG2EUAAAAAN0IowAAAADoxjWjAGbkrX88/dpSr/rZuWtLnfknk+te+zNzNb/zrunrev2LXacKAABYHhwZBQAAAEA3wigAAAAAuhFGAQAAANCNMAoAAACAboRRAAAAAHTjbnoAK8ibL5l+171Tj3PXPQAAYGkJowB4hDdeOjm0+m8vElgBAAC7zml6AAAAAHTjyCgAdsl/u+zwqf1v/On3d5oJAACwOxFGAbCkfmXT9NDqNzfMhVavvXxy3ZnHCLYAAGBPIYwCYLfx81dOD7be/gKhFQAALHeuGQUAAABAN46MAmCPc9xV04+guuRoR1ABAMCsCKMAWLGOuOaoqf3vO+qaubqrN06uWX/RUPPK6eta/7YkyZFXvW5q3bVHv2WoO21Kzf+Yug4AAFjOnKYHAAAAQDeOjAKA3dSRV54+tf/aF/z6UPcbO6j7r0Pdm3dQd+pOzA5AlJVjAAAgAElEQVQAAMYTRgEAC/a8K35nav97X/j6oe6MHdT94lB31g7qXjPU/cGUmlPmai4/Z/q6jnnFUPf2HdT9/FB3/pSaE6euAwCAyYRRAAC76PmXXzi1/z3HnDDUXbyDuuPn6jb98fS6DT871P3JlJqfmboOAIBZE0YBAOyBnr/pkqn979lw3FD3pzuoOzZJ8lObLp9Y8+4Nxww1V01d17s3HD21HwBYGXb7MKqqDk9yZpK9kpzXWpt+wQsAAGbqqE3XTO2/ZsNRQ917d1D3vKzf9P6pNVdvODxJsn7TB3ZQd+jUfgBg8ezWYVRV7ZXkD5L8ZJKtSW6uqmtaa5+a7cwAANgdHb3phqn9V204JEnygss/NLHmymOeM9T81dR1XXnMjydJXnj5jVPrrjjmR5Mkx1x+88Say4/5kSTJhstvmbquTcf8UJLk2Ms/MbXuT4/5viTJT1/+6al1lx3ztLzoirum1lz6wu9Mkpx4xd1T685/4YFJktdduXVq3VtesH+S5NeuvHdizRte8JSp6wBgtnbrMCrJM5Nsaa3dlSRVdUmS9UmEUQAAsMK9+cr7pvaf+oInJ0nOvPKLU+te+4InJUn+8Ir7J9b8wgufkCS58IptU9d1wgtXJ0n+5PLpdT9zzOqp/QC7s909jNovyT0jz7cmedaM5gIAALCoLr/8b6b2H3PMvkmSq/90ct36Y+dqrr10+rqOfNFc3QfeNb3u0BfP1X3wnZMDtf/wkrkw7S/fMT10+3cvnau78aLpdT+6ca7u5j+aHAgmyY+87Am55bzpNT/08rng8BN/OL3u+35hru7Tb/vS1LqnvfKJSZItvz+57qmvnqv5/BnTg881
vzgXfH7ht6cHqfv90lyQet9vTT6S8Mm/PHcU4Rd/+3NT1/WkXzporu5375xe95/XztW95Y7pda/7nqn9kCTVWpv1HHZZVR2b5LDW2suH5y9N8szW2qvn1Z2U5KTh6Xcn+cy8Ve2bZPo37sJqFrtupYy50LqVMuZC64w5+7qVMuZC64w5+zpjzr5upYy50Dpjzr7OmLOvWyljLrTOmLOvWyljLrTOmItX9x2ttYUd1tla221/kvxokutGnp+W5LRdWM/mxahZ7LqVMuZynpv9sTLHXM5zsz9W5pjLeW4rZczlPDf7Y2WOuZzntlLGXM5zsz9W5pjLeW72x8occ9rPY7J7uznJ2qo6qKoem+S4JNNvzwIAAADAzOzW14xqrT1UVa9Kcl2SvZJc0Fq7fcbTAgAAAGCC3TqMSpLW2rVJrn2Uqzl3kWoWu26ljLnQupUy5kLrjDn7upUy5kLrjDn7OmPOvm6ljLnQOmPOvs6Ys69bKWMutM6Ys69bKWMutM6YS1s31m59AXMAAAAAdi+7+zWjAAAAANiNrPgwqqrOq6qDx7SfUFVvncWc6KuqfrCqjpzh+CdU1VNmNT59DDdauKmq7qyqS4ebLixrVfXOqvpMVX2yqi6oqm8Y2vepqiur6raq+khVvamqtlRVq6p9R5avqjpr6Hugqj47LLOpqr513lgbhuXX9d7Ohaqq86vq4/O3oar+fVV9rKoeqqoNE5b9/ar6v4s45iuq6hNVdWtV/dW4v2Nj1jX29Rz6njOs6/aq+vNdmOfEdS9weya1f+PwedkyfH7W7MLc/nLYtlur6t6qumpor+G9+9mquqOqXjPSvv19e1tVPWNkXc8dXutPVtVFVbVqpG/sNgx9P11Vn6qqB4ef+dt5xsgcP1tVXx5Z9sKq+txI/w/u7D5YTFNeq9cN23hbVd1QVd+xgHVNeg3WD+u5tao2V9V7Jox5QlVtG9k3L5+wvpcMy9421H9+/nu1qr6tqt49jHN7Vb1sgftj0vfkL43M65NV9U/D7/nbcGBVfbCqbhn6Fvz/SM37XhnzPvpqjf9uHt2/f1tVd4+Z/9Oq6saq+lpVvX7euIcMn4Pt3z9PHekb+34dsz++XlWPH7MNo++je4e5jX3vV9WPDOsZ+7270P02tO3oM/odw/v6tqr6UFXtP+E1eMTrOeE9OXb/jlnfI74Dx9TMn9tlE16Dqod/t715wvtj9LPwYD38M/aDI3UP+7sxbjvHvFYfnDC351TVV4a2rcP7cv68FvQeGqnf/n/FuyeMOfq/zLZh3Pk1C32ddvpv8pj51rR9ONS8atxrNq1v3r69dXhdx31njX5P/nVV/cCY7Zz/nfvhCeuaP+YtE16D0e+i/z28BuO2bfS9dueE7Zz/v+lVE8acv53vnbANVfP+FxizPx72/9+k13An9sfoZ+/LVXXfmHnN//v4lknviaF++2fvvAn77WHv8Snvox3+HR0z5o6/mx/Nrfh2558k/ynJN49pf02Se5P8Y5K/3on1rUnyxiRf3UHd3kleuYD1/d/h939P8voxc39akgeTbJjXd0KStyZ5TpJ/O9L+iiTHj6nbPs5TkmxayLyTHJXk1HHznbDsCUk+MmY7LkyyIckNSe5McvC8/pckuW34+WKSF82f/wLmOrFuIeuat692qi7JPkmuHOb/kSRPn1D/oSSXJNky1D5jdKwdbOv8MZ+W5MYkXxvd3/PXkWT9MNatSTYn+fEJ63tOkq8Mdbcm+bVp+yPJjyT5eob3ZUY+Z5m72+UdST6e5PYk5y1kv07bB8P8PjnM8c/H9L11ZO5fSNKSPHXon/r5TvKrwzreM+l9P6/+1iRHTum/LMlvJzk1yTlJ3jSuPsnnk+ybuc/HZ3Ywx4uSfGL+stOWGbOONUnuHl6bD87rOzJJDT/vSvJ/krxh2I7Th5qnJbkpydZhPfvOW/59w/LPTfI3ST6T5O8z937/huF9sC7JX2TufXvImNf+KeP20cjzYyfM
f0F1o6/vsN9/d8L6np/kY8N77vYkvzqyD78/ycWZ+0571bB9bXgt1yV5R+b+rhw8sr7zM/d5uC3JpiTfOmb+a0aev2X7ezDJ4+bN/f3jtnN0H4x5PU8e2r9/eE0OHJ4/Ydq+GhnnL/Mvn6+/TXLV/HWPWeZxI9v2tiRXj+6rke383cx9l/1jkhtHvkP+547mtYP3++UZ/hYmednwmj1m+3aP7KfbR+b10aH9MUnuGZbZkuRLSX59/raNea3WJrklc38THjcyzj/XzJvj3yZ558jzCzPvb/1SfQ7mLff/2zvzeLuL8v6/JyQEEvZFFgmEKIiKAi9ABcSmoKIWClqqYjFAXfrTClqkWpVSrFiL0KqtCipCtCjQJBAQKHtYwpawJOGS5Wa7XLKR9Wa/+/z++DxPZs4359wEExJK5/N6nde955z5zjzzbPPMM8vZqJzrJPJfzUbrHfbdn5L8/ZeAW+z//wTWN6DtAuPlXNOjJuBoYBfSVRLvBpob8PZ8ase2HxpNk4DxwHvs8xOBPe3/f0Q+q2oH3wauBJ4A9gWWAztuhk7VtatKmVeAxxv04ZcZDe8AWhrYcrWv7lfqxl/Ahcgmh9bRgZy/XwZmG09Wo5hgR+QHjjdZV+O3ZuDt2fMjN6WvlefPAB6q1wfTo7ORP34y51uljh2sjrs31V7luY38MZtho8Ao4Dz7/xSrox79VXkupb6f2Yi/Der7MnCt/f9p4L46Zaq0za7HE2rH5PcBLzTQj28DV9r/N5le7Fipaw9gKmncuKheP+vIalwD2oaT4q1j6tG1uTpk3++K4oqngD80aDOPZW5D/sfnDweicXlz5dRwTN5MvVyD/OEobD5mbd+EYoS/Q/HWDDQ+zavyphHfct725bOo9ZMfNX5U+3k+tX6oUV3VNkc2kEHuiz6JdLdKf1XXTmnQz2psuqBBm9V+Tm/Qh6q91OPHUGrjv0ZxxebyI7e9T2PjUIWu6vg4tx4/6tjeJQ34VqPjNNajTY6jddrcpG/+X3+B+eYiaDX1HsS4Y4C3AGNCCHcj5/JM0CrY1SjYbEaTjmo9/WOM3XWaGAp8CgmgL+yBBpafN6DTjaEvfA3Y1Mr1cOTYngCIMV7bV+EY4wJkRI2wge4Y4x3AHZto/9VgBXIaF4cQ3ocmZzcA3wQGA+eghM3IEMI/oKD2DtAKGkry/HUI4V3IIM6mwmOT/732dgiaeP0Y+AgywDUhhA+gIHY/4GBgMUqCPQZ8Ak1EB4UQfh9jfCKEMBwlC5cCR1o/ns369W1gUozx4yGEI4DfhhDWIyeyFBnpW4D3ogTOLOCr1rdOYPeg3TNPvApeLkcBwVmbKPcgmkDEEMK7UaLkiAZlH4sxnu5vQgjn1ysUQtgBBfL3Zh9/DbgxhNAJDLM2vxlC2Bc5z9/WqScgJ9vbVwdCCHsgGX8kxtgaQnhTpchwFNR/xcqfYf1sA4gxnthX/Uh+G/rdSO9DCDvEGHuQPn0shDCVWl/TDJyHBs/9o34F9AQUsO5O3z/A8J1N0PgppCtbir3sb2sIYQqieUTUD0QAEEKYAPwZssedgXazuREo4eLyGhJCGI/6fzrS651jjA+FENqRXMYBA4HPxxg/H0L4MZpEjq5D2/koAFjQB/2fQwHkuE30s2E5l28IYSTyC4Mr/DgPuAY4FSXPJyL5EmNsMR9zJvIdB9ozA1AC4yrgXODjMcapWbN/F2NcBdqlC8wJIfwPSW8C8uNuFzuj5Aj+nGGwf274itMPnARcH0L4dB15HmRv/xxYFWNstboXhxDq8qoylu5udI5AAdrt5lMmAO8MIUyn1g5GAHvZ9/ugQGsZ0pN7Kv1cgnzZzWhyDNKP3wKn1ZNhnXHe23yTff4sktEOIYTRKGHyGfc1McbFVtWZwJ1oLFkLHBlCuBP4WzTGHwx0A3OAS0MI70T24rIM1ua5IYS3o2C3FeiIMa4DVlXlWaF/D+Ako3FEtZ99YIvtoIINepTZQUCTwCFozPgZ
khWV+p5C/T/O+jOgTl0jkAwmAzfFGEdXZRhCaEaLCj1Qox97mH4tBY4IIexv9V0InBVjvDeEcClwf9BOuGOA5hDCCKvv/2W66nYAmsCehGK65UjOWNs1tNHYT9bT/V0w/1ZH9hElQEA2lfu63Ja/B3SEEAahxP1VwGeAjzeg7SDgH80/Qa1vzvnxPEq+fQc4APgs8LkY4zXW1t8ALVbW5RaBt4UQbkM2MqSRvjagrQfZtvfhEw38xYnA4dU6DRei5PLxm9nmiIxv56K47tYQwtP0baOua/sjX3CG1TXWeFyVQQDOCCGcinRtIPDPWUwzqNLX/sB/WQxVjx+fACaaLG5FPvCQSptV2oY04EcAdkIJjxFoktthxXL9+DCwLoTwXWS7azBbyOpqA/YG/t1041wyf1rp54dQQraJzN4qcnq/PTMoxvi8fd+XPKs6VLWDj6HETU2MVCnzceDZEMJVaFFzP6RXPzfeHGn1N5LTx73eTYzJNWjQH0hj0iwrcz+yyTnIL+1ifb4BxSE/DdqB5b6oEd9agBPdRis+aybwPZt/5X5hAkrSfqjC2xqf24f/q2lzU/23Nl9G8tq5Uq5G12KMDzXo5y7Uxqa7IJ9at10bXy4Cdo2WRanwY0MMi2K+YWijRD2fu5u9r4krSHawufzIbe9p0ji00diSjY/tmZ+v1pfbXksf5TboeEWPGo0bfY2jDX1zXWwqW/VGeCGn8BByDHOAf0JObCFKIByHJs09aEK1HO0AmBJT9vJh5IyXAr9ChrEODcIfQoN5r7Ux35Rpsn2/Hq3SXWTPRSs7A7gYZbjXo1Xy55HziUZPu7XThlZxvI119v8Coytmz3Sj4HC+9bfN+jrL6ui251dZHccjZexAK5hzjB7nR5d9F+3vPNKunUeMJ9FobUFB6Fft83aro8eebTGe9NrLeRlRILAo60tEk4c1lc/arX8zgW+gFZtF9vlJyGCi0b3G+vlC9tk4NOj0GC2dWb1XoFWndvt+fcbXNVa2HTmlVvt8bkZjL9KpLjTJarN+L7bnZmQydL4sRZOtpfac0+VlvNz0rL6V9t5l2J3R6XT1Gl3Ls/f+ajX6euwVrW+9WVnX8S7722ptrjT6piL9683q6rV6vP5cbi8ZPR1Z/7z9Tnt1ZJ/HymsdsqHVGQ+j9b0NDRAr7X0HyRa8DX9mFXLMnSQ9X5/R0Ja1n7fTa31+3Ojvtrqcfqd7HUnXXqLWjpymNvtsHdKLmJVzWS+z9+9BgZHbf6/R67rptC3Jnl+LbN53g3UYraPQgDTDPmun1uZyfehFOrkuo9v5sYJavcp19flMXu5nL0M6kbczB03OvN7bMv55H10nPRDuQMH4yyiJN4WkZ93W75X2mevSaDSxXE2tPo0y/rST/GKn8aM3q9P10n2Zy77F+NNsnz1nz0wxel2ubUjfptj781Dy7wnr41rks26075dkPOhCuyJm1+HBV6yvTus1aDzJ5TnT+L/O+rEO6eQkkn2vsb+r0Tj3PLX+bDJKwFxl/ztf5wC3GA/8s++gJMxKkh3MQ0nT35FWdF0HxqPdfb+3OlbYc+NIO2wWA9+z/y8zmmYZPU7XROPvt63uD6KE+1Lr/xfs8x/Y++X2d7XR/Axa/TzM2vFEVLRyj6IdXH9vz/zQvvtvo/smtKI4HemDj8G5Psw03rt/cV2/GU1oVlDr/8aRYgD3Y6vsfSM78PIvoDjiFeNfp/1/ndWRjz8LUCJxEtI315P7SeNKl8l7WUaL2/d648kSa/dM4+G1Ju81GS8i0oMdrOzLVl+X0bWaZM/rTU6rrU+rMnpmokRYNLoXIp3zcfsak+lsK3MSWjiaZc8uQYtMA0gLj783+Y0j+ZwLjH/LjZ5mq286sts25AceMBk8jcaIv7RyE0hxgMc4dxuPe6zeXwBvtu98jBiZ6bvrhMtvqfFtIbKtifaM+5fZKJl6i/HsLmS3nWiBLwInWf0PZv1rQ0mK59BO+ntjWvWPpJjverRy
fjLS2YhioN2y70Yi2U8BfoSSSXm7o43HE60fvyONZy+jMe9R5B+mmVyXIl2dg2zzzSj+3IFshwFa7HgULYpFNK70M9o6UKz7Ekp8+Fh3EvLH60jx1X3G84Wk+PF5FNteb21G4DvW7pqMV39uz6wlxe+5nxle4cfzaOfOV9ECBaQY6SQ0Zt9svPXYep9Km/fVoW1mHRk8bn13WT2I5kAtaKLpbe5qfV+F9HJenbpuQclo18WqP/V+nmmyugHJfBxJP36dtTkcyXqBPf9O0k5x76fz7DfG229lPMv5cT6aDF+PbOIPDdr8F+PvJcjvu29cbDR4ufNNjouQTY9B420PmvPsZTT8E2m8nQgc0WBe6nQ2WblJSPeWoTFtPRqPPX6bhGxzFYpbnrTvPpbbZVZ/lW9fsbqXI//1Tis3wGjIeeu6MRZ4qg5vryCNo1OAIVldz5H8X7XNsfVkUMcXnVCH/qqujWjQz9+i3YzXAz8xWbbQ2Bddj3Tjunr8QDpxJ8n2mpG95PzwumYivVxGfTvYLH5Qa3tr0CJwlbcnobjB47gTGvCjantnNyhX44sqenQM9fXjEjQmOW0n2/d1fXOfeZrtnSjaRsmov0DBYqu93x05lLuRsZ+GgpNbjPFPu4JY+T+YUuyKsqxdwPftuyPQYDnclLTDPu+PVk+DCXwxGuw6rfxQ5MS+nynDj9Cquwd6u5EmY2+x/2eREksRTWDa7bsXjfZe+/8SNCDegnZ8+QSjGRnzI8jJLUAJqRet7l5kjPNNwR6y+lcg42lBA+OBVuftpABtIXIYq1FQNo6UABhvdd9lNHXYywPV41DyKaIk0x3IIb+U8WGa0bbOXtejTHUXChzOMrrb0EThcZPzamtrGilhstzqeJHaSbAnePKJvydkfKA6y/jXixKZPqFbiAYmn8T+FgX63dmzPSjYyZMePyQF9KsqbeeJofusrheRDrvTbDO5+sRlFCkpMYMUUByV0XADaWIyhTTpvo+UTPglaeLkCY6njY9LrM3l1CYaO5HutVg9b0dO0xMXi0kTk9tJk9CYyXklKfniiTqf+CyidpLkvBqPsvFPoMmBy2IktUmT2fZcl71/xvgwGdlTN/D1rI01wCCSfhxImvB/FTnvM7L6XyAlLTzRvMC+u9L6+iPSpNoTei+jyW5EA3OL8Wam8Xg+mix4YnOePefB8y/t2V+jidWijJ8zkO0utdcUpHO9SA98EuQBt+vLeut3i/EzWrlTM3ktz+TZiiYm3WiActmtRHbcm/W9y8quR8GNT6Y8QeXHSFfb82PQwOpB2JVIt56w+n5gz7xiffdkQhdKVEar571Gcy/wr2hSv95oGYVsySf9E0iLE57Mn4r8sfvI4Zk81mRy+Zy12W48Dsa7h1ESfiVwKJocdZGO13ryzPXpEKRHrWiC/x9oQtaEAptdUfDViyaR/2H0rkJBjE9UHkC69Ary3+uBq228mmLPDzZa25Gdn4D04VvApUaX+7ZhyF/fb38nWP892XmTtTUJjbm3ksY+9wHXo2BmISk4usZovaCajLL367GjnMAXgUvt/4HWj/lo/N0N7Ua82+puRePZ/fb5WKP36/b8J9AuUNAYdbY904Imt39jz4xHOtCB9Ha18d1987us/WXGx4CCwKUmNw86L7LPWky+l6Gkmfula5C+tNn3zyJdfAnZdz07WIPG0SnW10tJOhCt/jtJ4/b1yL6nGp+eQzb7GeTHJiA/9q8mBx8bPIm9Cvmty1DC/MvITs5FY8OTpIliD7Aok9v1xk+fdL9svPBFpH2QXoxHscV8e/bDyFYvMfk8gQL0GUi/fayah3yrx2SXI9//hNE2HyVBRlv5Q5HMf2Ttv5UUmxyPEhpT7bsHTebjrZ45md4sISW/L0PjT4vVc649/wiy5RYUC92GxoV9UaJgLRozD0B69grapTHLaLjP2pludK+1vrUa32ah8WAR8KtsUvF+LA62z9wOzkB68Cu0w3cI0JRNmleSjif5
M7eiOKgVjVvXZd8dgPR+IJqc/Xul3Z8YP55D48SzSO6tyMePRXq92L77LfL/v0ELprOQr36f1TeS2mRUO0qWtSJ7P9to8zi1f66Txvf1yE5vR8dE5xndDyBde9l4fDspsdxBSkL45LQVJSm+bm16nHhJ5mcmVPjxG6S/4zPa1pLmLC+i40v3WZnZaJdI3uaBJpOZ9v0C5HdrZIB09P2ZrB4EjiVNOr3Ns5FtjEXjwFykW3ldP0Ux1seQ/+gFflDtp8vK2mwFPk/Sj9FAmz2zm9U11v7OpHbSnPPsn5F+5DzL+fGw/e/XA5zWR5v3IB0fjeLAWdbP64BOK3c+8nfN1ueVyAesQT7ja1lS5TDkQ+8CHmowLx2KbG5EZhPdVt8VaBz9G6RjTZntTScdteomHW0/BRhbSSJs4Jv1cxcr9yQw08r9Cvniql+YaPS9qQ5v9zY+noLG94eyun7cR5tz6smg4lc+gGyuSn9V12aixFK9ft5gMmw1+o6isS+6GPnqvevxw2R4Ecn2HkL2soEfWV33kOKgr9exg83iB7W291ZkeyNz3lb49hjwQAO5V22vXjKq6ouaKnp0TJ02faFgb+PZj7Pv6/rmvl7/Vy4wfwE5391DCCfHGFfa59H+HokMv93ej608/yZkbKtjjJ7QuSSEMA8NkLthWwozBBTAebC2Dwrem4GeGGMLafUc5CwD2h7tyvkoGqAD8F0rdz8KdO8mJQz622soykj2oIGpH/A2tLX7IOTEorU1yt574sqDrlYU8A5DZ+eHkbZI91gb89EE6Hg0kP83mpxchRTTd3uMRk5gKWmFMhi/P2T17UDann40CkxAgeOJaPIdrX89KEH1XdIk+YyMJ4PQivhTpETHLaTJstftyQtPpPiKsK/w7oiCpAnWrwGkFZge68MvkU65DnmiaT2aKGDl/gxNhCElQFahgeeyjO+DrfwE462vqLmcdzK6TjGaliFH4ImP25GMe62eD5J2swwz3gTkaJze05FjhDSBmk3aXbIjCogDaQfRYBTAY7QdZHX0tzoH2v/H2PO+Q2AyaVfHFzMZ7GTfLUf20N/a2y37/yhkXz4535d0lNXbXYsCgG+hQPokZHNHoSNX7ut6kV532bNtKAHlR5ZXGB3T7ZlobS8j6eJ7kL/oRQHz7igREuyz/YxfLyL76yZtE/57dGTlb+39S2h7fReyWT+W9zajs7/9DcgPzTN+D7BnQfrgyagutAvlU0bHAPvsUOSHBqMV4z1Qkr0LBVL9Mvm8FelgfyTPgcguFluZfqTjlW4vjoHGP5+AD7O2BiO76CbtKulG9t8fTZL9mOjHjH/7oPs2drLn5yF93xnp4n+jpNhQa+/TJF+1M5qIu736hHww8qu72+fnIZvayfh1AOkI4QAUdOxp/DjU6D0cBSSTjMZfW1/2sTqfsbY+aLS4rc013p1obT4cY5wbY/wI8qMvGS/fhWzLjwd50n9X6+/59tlBwP0xxtUoOA0oAP8A0uMdUMAAWmE+0Xj1JuQ3+wMfsCNT+9vzuxoPdrS+32Ly+yjaVn638bQf0pOp9v277bnBaBHmy8g/7mSfY2Va7f/1pMWNU62OA433pyBd/Qsru8q+J+iy8P5I30GJiREhhEkoeeY2E9CK96+s3/sbPw5FfuILaHeaj+GghMC77f95Rrf79wNJO3ZWIt+9COmSL1JEYG6M8YWoo/xN6LhxJB0p2QsFZ+tRMNqL9GwX5IemG52+aLMa6eJhyJZ2QXa1P/XtYBBKAh6OZDkCLXj5WHAUupNwZ2RnnzEZvcX4/k4k1/2N54dYnX+OxvYjjR8DrS8+Nl8O/LXxZYj9vRiNAweEEE6255yfH0YxwHDS2P45a38Ha/MBJM83I99zQAhhGUrS9UM66QmeG9A40mTf/VWM8SDjde6fpiIZn47GpbeixbMJMca5aCfUrQAxxlnYIlyMcWLUkQvfpXMimqAcj/xGbwjhn5A/bTfeD0QyhbRjaxrSl8NM
ZgeSEgMPxhiXxBjHW5kPo8T5KqA3xtiJ/MAQK78PGieajV+HmtweMJ59ENnbB0MIVxo9nvjLEVHMNPTkyVYAABcASURBVNCeu7giq+r/IP95FPKBEfkJP/YeY4wLo9CBZHNUpY4T7P2BaFL5NiT3A0h3GB1C8j29SC43WD/3Q0nXm0MILWjy9vMQgl9PMAH5kIjs4f1Z23sAs+y5fsbH9yGbfwjxehJKDn0E2d1lpPH0H1CSwGOU56yuQaSdLJ9DdgnpiM04e38baVdfjiFIH522nUm+cx7pTpe3Gn+ezduMMS6IMX7CaHvR+rGyjgzmkY7wRTSO5MdCna4L0CQ3xhifROPXsDp13UParenznWo/j0MLIL9HMv8XtIPSF192RA35IlCMOvrlx9yrtIHdT1fhmfMD5Ksetjb3NLqPbdDmD5Hs/xL5uU4r/3Zqscj4NRSNdd+xNj8DDA266P5ENL/6BtKfA2iMgUg/QfMYSDwFxWrVHwKp6s3mfBdjjKts/ur9GGB+YV80R8yfHYbGibXAhKp+xxiXGR9BenGs+b99yfxHnTb7oaRPjQwqdD6KxqOq3Ku69ijSw3r9vADNM55F9j63ni8KuqLkErT7a1nWh5wfVds7GcUtub3X0MDGccURr5Ifue35xpOh1PfNoDn2W8LGF5fXs73cT+bl+npf/WwYmq+fiXZ75XKHvn1zXfyfSEbFGJtR8LEbOl97GXIkT2fFlqDAqJ+98vtnAgoyHN1o9fk/0aB8Kgouc/wVCgZGxBh3RsGUB10bSCNNODpIE+JuYH2M8Wi0qtKOMqj9kYMIWR0r0OqhH2mZjCYgOyLD8aOENSzJ2vW28+98l00zChBGIYezjDT575fR0VHhT29Wl/cnx7dQYDEDDYIv2ucXosmn7y5wnh5gdc5Ggcq/oYF6hT0/AGWtn6Q22PJ+DbW/y9CqhR9j6EADz+GknSE7WR2rrY++gr8zCg697sfQxGEByhzfaf3cD63AeGLqHOSMfm+89ONRn0RG6knII42+ldmzOyHd8l02a1GQNcrayvXA3/vxgXtJycZ5pCNVtxp/u5DOLzK++ln3ISgwXGH9f5y0u2GEfTa40uZiUhJsgf39FBoEnF+91v+qnngyLJKSdZ6IW430/hyU6AxZnyfY86vRwOVHtALSr9lIVzvQSognG/14i08sg/XtcePL2ch23DdGYJnZsO/ai2jFrMv69GbgT0iJj5VooDrE6tgRbdUFHcn4GVp5mUPaddZi/bkM2fGPSZPSwda/f0O7D1w/miq8/LrRNsNoXZXV/WukDzsi/xCzl8MHQj+K1U467vY1ZCuOu+zZOWhA9uTAHmhg6ofOzTtN7aTE3BeNT36s6Fm0Uup3JtyMbHoJ0r3l9vfH1s/8rsM8wXqR9fEVlOg52niwgHQcaB1K1vyX1XUz0oeF9tkwlGwMpJ0QviPuGyggOd/qG05Krj6LbHkQCjDb0MQQtEL3pRjjUNIR2cDGA74HoP+JEgVrkJx9h944lHT2HzvwowCgZBukxQ2MFj+vPw35BD9i+hYks7ko6eTH2AaTdnn5wsQNSBYXGl0tSDdXo4nhIUbHtVbHfKvXx9w9jXd7okTdIShh1R/5Sr8XbzGaYByBdGO60T4DBTcg+1yR9TEAF8YYjzZ5n4zG7suQHo5DgeVSFIg9icbQ/dGkYgma2INs2HXwDpQMOxj5qFVG63ir6wD77vvW95NRksiD3GBt7Bh0P9x9yDbWWv8GINvZidqjlgdR69edN0tQInC+0TKD+nbQiWxpOtKRC40va+z9DLTQ0onuEtqJdOx9OpqUfwWt5u6dtXGplWtGOtlK2jHux193RYmDfsgmPHF3AZpo9QP2N34E5L++h3zhohjjfehCVPejVxiv9yTtAvkq0sn9kI8/GMVaa5BcPenjd+DNN3mcgGK1D1l7H0D6cDVpMQPr16n2zH72TDU+2tvKf8z++vG509BY5eUGkBbYhhnP34zs+TfI
lputT7OwSWvQHWMDqI0pnW8g2/im/R0UYxxgNN5lz33JvnsTShIdi+KdPU0uB4cQTgj69btzSDuK+wGftKRbNUGxO2k8Owf5qt2R/R+MEiDTvL4QgvcloN1TzVm7nshaHGPc3/ziaONPf/tuF9KPrPg9rB1Wlyd5D40xDs2e/3KM0ReS3bcejHxdNNo6vc3MH++AbLIVLb6NDyEcj+LCTuRDfEz6IrKhc5DP/35W1zo0JhyM7OJUK9dstPvE+U+Qfzw4k+mRaGdDTtt6oL+VuQMl+K9AY/GYapshhI+EEPpZm4OwZFhVBqS7s86xPq+MMS4kwelqtXLj7f6dtyGfldfVgnzfZ9B8qosUJ2zoJ7o/ZyjyGc+jH/4Ya7R9GNjZdGN/kg69B+mkzyc20GY6dLzxf/86MhgCnJG1ORfttH2mTpt7IB8yHsUrfrrkLHtuQCanPVxOaIw8xtr8JtLdt6LdLUcjX/Z8jLGa0MrRDyWcQWNbD2knGsgfB9JmB5eZYwfS2Oi2XA8HhxD+zPp+Dlr42hXJ7hxkH87bg1Gs+bMY47719DuE4HepnoNiHd95dk5M9yTVa7M/sKyODDb4oqBfrt2RitzZWNfei3xOtZ8fCrpr9xykjxNjjKuqdmDJkVvRuH1PCOHzJB++gR/I9s5DtncWkumQCj9cXsNIx+/yuMLtfbP4Qa3tXYzGjXMrvD07q2uu8WxZlR9sbHu5n8z5lvuiuWyMqn74OFqVO5vwzfWxqa1Tb4QXCjrehiYWvh1zJcp6P2zMfAkFYV32XSvpmN4d9v0gFKyvAn5k341Fv3BwLAqOfEvnV1GAdywKsCOawCy3ugejge5PUbJkHtqGeAppsrkrCvRWoKC1GwU+L5OO6V1BusdhEuli7C6U5LkHOcqrSUFjM1LKh62dO42GJrRC9hJKoCxEA+8oe2ahlRmPJtcHkO5VWINW5NajYNiPvz2EBuI2e64bTfr+lbT93PtyqfV9ufHOE0uzjQc9yAn5EbTb7e9klNw5zng73+j5FEoaTbDnX0QG6cHnZNIv/PiEs4eUgOqwch32ud8DFdGk7moU/P6J8cyPM/lxjB5kiJ802byY1bXMaPQjWt32epR0hKjXyvkdW2tRMLTA5LOEdLzGk4d+1M/lvdL45IkYP5rqk8ZRpHvJIunOHL+vqNVkuBTt6ulGE7pp9momnaM/EE0C15KOX/SgScIw0t0g7yBNsG8i3TW20D5fTbrDZzVyyq+Qjle+QjrK0Ymc4gpkl/eRJm2uOzfa/34nxX+R7v5ajZJVd6FV9N9Yu18gHVtbh+xvPumY3jfs+/vt+0uy8r8m6VOn8eGxTJ6rkV40mSwWGw+7kT1H0hGidpPBCmt/b3t+rbXTY220ID2PVueUjMetyM+sQfpwtn3/Jav/StK9cL7q1GVt9pKO0np93aTEXBdKmPudKD3ZayTSEb+Dxndn+N1Lj5qs5qKdOq1Wxyj7v5N0r8wzKCE5lHTk5QdIl/xY3Kko2fcCsr/TrdwCFMD4fTp+H4nz6lvGm2+hiYfvEHuF2rvb3kE6gvay9f0PKCDpzMr5kdAxpGSwb43uQf7Oj+kNt9cU0s62V4xv3vbTpGOro1GwMA3pwzrkW543GvZBCyWe2Pak4TIUvEekP34st9nousm+m0k6/vsICs7vR2OjbxmfYX24jrSL8kaS/i+zPvzEZLXKysxCu1qfyfjVhhYIZplsplrfxpJ+de8K0tGJCWiMOs6++6KVHWDv/9Sefxz5rDFoddrt+AGkH2OQX77LXi8gv3aU1RPQ7r9O0v1mY1AMMIY0LkyxesegCe1aq6uJNBZdazR1IN/eaa+59v53yA4uRb7M+bjEZOXj2QrS8fd3UN8O/JjeL1DycSzJDmajeMTvJPsG2vHkOnG9yedw0u7jhSa7aVbOj1MvtHrcDy9GCSC/W2eq8cB/GessUvL6WtIRut3QTqV1Vn4u6YjpVJJPaTN+vkDylUeiwD/3WVPt
+4et7EtG37WkI+9+HDham7OwXy1Fvv0+a7PJnltHur/sdmT7i6wPK9ECQY/x15PE/47swI85rrT2/F4YXwn/HUpu/6X1o4l0N8yZpGN6Lcg/dFh/dkMLXFeiWLIDxalT7a/LexhasZ5H8n8rrF9+J8lkkn20WPsTUfJjfyvXY210Gg8GocSN83+B/e828hDJDm5E+jnV2n3Zyr+YxegjSfb/O1K8tRTpZieS/40oUVX91bSR1B7TW492Q00lHf8cU+c5t+2RJJv0eLAV6fgMpGtTSUepV6Mk/cCsLj+2MxUlAf1OscnGq9zPfNTKeQK5y8rOI/k9v5fzWntunsl0AjCsTpv3VWgb14cMlpKOVF9ldfv8wvVjKunS4zXGx3p1uR69iMbQev28FtndGBR/vZDRdpvx/FpSsn2Kycx3Yy9ACcFch16m8ovoFX7kbT6KdLpem7PsuRnW3nSTyY1oQ0E7KS50XzQP+ZZ97Nnz0fztJ0h35hr/34mNKXXmpUNNVg8anX5X4x7IftqRH3iJdDfTGLQAMpZ0bcJa+95t76JMnjnfHrM625AOdSN9mmTfLzZ++Bg12b57pg5vn8vqeqRS1yTk/+q1+UwDGeS+aG4fcs91bUyDft5GsoPbTfb17GAa6f5W94H1+DHFPp9j9RxXhx9jjGfdpFMr9exgc/mR214kbQjIeftUVtfTaLG2Hj+qtvfrBuWqvqgN2WPVL+T64QvNTttldfR8JOXOqA3MOM2E7GfCj0MrQdOxnxlGCRW/c+cXKGj1CxtH2qvJXk+i4GI96XjMALSC6ZP8063+DtLkfjZKyLRZO34UZSi1l4LfRgrC/H6OSaTLkP3+ol5qL4/14H6xvXqQ0T5DSix50mMZ6QLdajJqmSmeX3y6wMr5Lq15wP8YrY9kdP2QtLvEV5B94tmOJjpzkPP0y0h9Mh1J9yDln3kAtyJ7vxpNnn6KHI0Hjk+iVbALSMmGZ5GxrTM+XEfaeTXJ6PAkW5PVNcjqWEtKirjTWk7aZTQNDTjzsn52G203Z31wx3gVGkA90dRGuvelM5OZl/dEgN/JFLM2VyI981X1VaTLXn1C7HePtJMuifYEm6/E92avXOc6SPc0dWevl0wWa5GzvY50ybm302k89eDFk11rSDttfMLVabT7BO0Rau938lcH0sEW0qQiUkv3aOR0fSBz2vMEgfdtldUzPeObDyLTTKZdlfp7gS7Te78sfDbJF7i8Z2V9XImOlXgfvR63o1lWh/PD6b2DlIxqQoOIP9tLSlo8k9HYTNoBtjajqZe0Q3Icss9jkT32oiM2LsNO0p1rrjttJHt1XrrcPcnpcnoe+c4lyL7c736YlGzqMX6vRitTy40HbdTq/nyTjycd/Q4uv8D3f0ze3o851qYfNV1KuqT+/aRkor9Wol1qfpzD5fI4SY9vpTZR64m29fbsk6Qdnn4v3YMZL52Hfp7/KZP5Xvasy3QdCjCbjR63pXVojBiJgpCcp1dZn6dYvfOtrB+V7kFHMBeQds+6HTaRdrmsIQWXQ1Hyxn1RO9KxPdDRCh/T1qCdETcZr5uRz2tHOwCeIPnEJjTm7o0mTE0mGw/mLyIl+X0C+j40EZ6HdG/DRA0lGjwg7Gd0eUD3JAqs9rH/3ef7rqDTSHeRTfR6GsQNl5P8/wLSBaeeqOqgNjgdSu1dC6NJ904MNfr2Iv1AxlMoETMSTWbuI43RL1s/fdx0PzUXBZ7Ow6odrLO/fmxxEckO1prMPk9KlHsdy0jjeI+V/Tzp7jTfqTyJlFB2X7qe5HNuJ90LVuV1DzCrjtxmkGIVPwbXUZHb10h6t8BeQ9FYdJHVOcTKn2/1TjYez8jk+UtknzOBL2TJizsrsl+TyW0R0im3vVa0I+gR0hgyjWSH77bnpplMl6EY6yW0k6IJLZRNIvm+95Hu6WoCfpjRMps0Nv6ENIGZRpqwzEA+d73xrZEMziDT0Uqfh6EJ0iy0IDCwnl7XmVDX/a6vcnXe
j0SLI00kWzmf2hj87Kp8GrQ1HE3m/4B06VrST6xXk1FPkXTyo2j8moyOP4N2pPyClHS789X0c0vKbc26SpuvrgxK9DahGLfdPtugj/a+hXRfU66rh6JNAJOR39hogp7RMAONUxPR8c/c7zRlZXJbuZx0Z9QbVgb/12h7I/BjS16vWcWvt9e2YOY27s9wUkIooNW5v6tTbiDpcr8TgEnbm/byemO98kF5K9c7kj4y6tSZRLwGNFxO9gslm/nM687XoBX/Ziwo/2Npfr0MXNuAX5vdB9fDLe331uTbaymDLdWV7UHbH9NmX8+8VnbwerOd7aGTr5Uv+mN8+Wvdh23d5vawg9eStgb1bLY/3h46ubX5Udrc/m3+Ma8tpfP1zI/Xa5uvZ9reCPzYklewhgr+lyGEMBytzHehs6LPo5W+dZVyh6Gz4/3Q6tqXY4wTty212xYhhAvQMckcj8cY/7Ze+deg/dvQ6kiOb8YY790W7W9r2CV1x8UYl27lekeiZNPoBt8PR5OL0+t9v5VouBytVl39WrXxWiOEMAJt7b44xjhqe9PzRsO20MOCgoItwxvBlxdsGsUfFxQUFPzvQklG9YEQwnfQWf4co9C23wfrPHJq1I38p6Gz/Dnmxhg/Xql/777qycr9DP06WI6fxBhvaJR4QdvYN6I9xvj9Cg11+1gtl5V/Gu22yvHZGOMLWZl3oWNFOTrQ9vKN+mF/GyaPGtUXY3zv5tL/avvZF+rUtRc6pvlKTj8by2Bzy21OH/ZCF/S2ZfVtdpvVxNxm6tFQ0q/y+aX0C9DxgJ1ebZsN+uXY0P9NJRft+6vRBa2OXmBajPFd9dq05/KkofPTj1s1pLmRDaCjSJuis88kaR1d3xUdV5qZffaqfIn99e8OI13OPgsdj6mx3y1BRv9QEo96Ef1bnJANISxn418u/VqM8ReVcpv0U32VQz8bXU0oX4OOkuQ4mNoLRaFOP7ckQd3XeLK5/dxE/Zvj07c4wf5q69gafXst6Ho9YEv1ewvtYFP67T4m96WfjTG+sDV53agutAt0my9GbWmMUfT91aGvuHB70FNQ0Ahbc/5RUPBGRElGFRQUFBQUFBQUFBQUFBQUFBRsM/TbdJGCgoKCgoKCgoKCgoKCgoKCgoKtg5KMKigoKCgoKCgoKCgoKCgoKCjYZijJqIKCgoKCgoKC7YgQwkUhhGkhhN9tb1oKCgoKCgoKCrYFyp1RBQUFBQUFBQXbESGE6cBHY4xztzctBQUFBQUFBQXbAmVnVEFBQUFBQUHBdkII4VpgGHBHCOE7IYTrQwgTQwjPhxDOtDJDQwiPhRCes9eJ25fqgoKCgoKCgoItQ9kZVVBQUFBQUFCwHRFCaAGOAy4GpsYYbwwh7AFMAI4BItAbY2wPIRwG3BRjPG67EVxQUFBQUFBQsIUoyaiCgoKCgoKCgu2ILBl1D7AT0G1f7QWcBiwAfgocDfQAh8cYB217SgsKCgoKCgoKtg76b28CCgoKCgoKCgoKAAjAX8QYZ9R8GMLlwCvAUeiKhfZtT1pBQUFBQUFBwdZDuTOqoKCgoKCgoOD1gXuBC0MIASCEcIx9vjuwMMbYC3wW2GE70VdQUFBQUFBQsFVQklEFBQUFBQUFBa8PfA8YAEwJITTZe4CfA+eFEJ4CDgfWbif6CgoKCgoKCgq2CsqdUQUFBQUFBQUFBQUFBQUFBQUF2wxlZ1RBQUFBQUFBQUFBQUFBQUFBwTZDSUYVFBQUFBQUFBQUFBQUFBQUFGwzlGRUQUFBQUFBQUFBQUFBQUFBQcE2Q0lGFRQUFBQUFBQUFBQUFBQUFBRsM5RkVEFBQUFBQUFBQUFBQUFBQUHBNkNJRhUUFBQUFBQUFBQUFBQUFBQUbDOUZFRBQUFBQUFBQUFBQUFBQUFBwTZDSUYVFBQUFBQUFBQUFBQUFBQUFGwz/H93rSmym2sG2QAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x7f1aed9d10f0>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=[20,10])\n",
    "sns.barplot(x = 'fea', y ='imp',data = imp)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 预测结果分析\n",
    "\n",
    "除了对特征进行分析,我们再来分析每个类的预测结果,\n",
    "- 我们发现0,4,6,8的recall很差,也就是说很多都没预测出来,可能需要通过很多其他的手段对其进行处理.\n",
    "- 至于为什么预测不好,是不是特征没提好,还是参数不行,还是其他原因,希望大家自行探索."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "pred = lgb_model.predict(valid_x) \n",
    "df_analysis = pd.DataFrame()\n",
    "df_analysis['sid']   = data[valid_index]['sid']\n",
    "df_analysis['label'] = valid_y.values\n",
    "df_analysis['pred']  = pred "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "df_analysis['label'] = df_analysis['label'].astype(int)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0.08741402158137186 0.3445328617742411 0.9389558232931727 0.2109727486013355\n",
      "1 0.1446172777181801 0.6844648581407587 0.6095960456792228 0.7802988982218828\n",
      "2 0.313324288508866 0.9012450882923827 0.8504355595264687 0.9585116560092644\n",
      "3 0.04459834669022528 0.08536585365853659 0.4602076124567474 0.04704633887513265\n",
      "4 0.024452577774973182 0.017687934301958308 0.42424242424242425 0.00903225806451613\n",
      "5 0.0975736732504575 0.8408789264120337 0.7729429307306493 0.9219078415521422\n",
      "6 0.01989335520918786 0.16304347826086954 0.34177215189873417 0.1070578905630452\n",
      "7 0.17792011106203068 0.7875073199297286 0.7034944549068843 0.894307501330023\n",
      "8 0.004559222565785322 0.2505307855626327 0.3241758241758242 0.2041522491349481\n",
      "9 0.04993058623083233 0.5150723122495208 0.5742035742035742 0.4669826224328594\n",
      "10 0.028475421215371995 0.5524308865586273 0.4847344207444584 0.6421052631578947\n",
      "11 0.007241118192717865 0.4618784530386741 0.46860986547085204 0.4553376906318083\n",
      "0.6870639391098695\n"
     ]
    }
   ],
   "source": [
    "from sklearn.metrics import accuracy_score\n",
    "from sklearn.metrics import accuracy_score,recall_score,precision_score\n",
    "dic_ = df_analysis['label'].value_counts(normalize = True)\n",
    "def get_weighted_fscore(y_pred, y_true):\n",
    "    f_score = 0\n",
    "    for i in range(12):\n",
    "        yt = y_true == i\n",
    "        yp = y_pred == i\n",
    "        f_score += dic_[i] * f1_score(y_true=yt, y_pred= yp)\n",
    "        print(i,dic_[i],f1_score(y_true=yt, y_pred= yp), precision_score(y_true=yt, y_pred= yp),recall_score(y_true=yt, y_pred= yp))\n",
    "    print(f_score)\n",
    "get_weighted_fscore(y_true =df_analysis['label'] , y_pred = df_analysis['pred'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 模型训练&提交\n",
    "\n",
    "我们使用上面的最优的迭代次数作为我们模型的迭代次数进行线上结果的提交,该方案线上的成绩应该在0.680-0.690之间,具体多少分欢迎有兴趣的同学自自己提交。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "all_train_x              = data[data.req_time < '2018-12-01'][feature].reset_index(drop=True)\n",
    "all_train_y              = data[data.req_time < '2018-12-01'].click_mode.reset_index(drop=True)\n",
    "print(lgb_model.best_iteration_)\n",
    "lgb_model.n_estimators   = lgb_model.best_iteration_\n",
    "lgb_model.fit(all_train_x, all_train_y,categorical_feature=cate_feature)\n",
    "print('fit over')\n",
    "result                   = pd.DataFrame()\n",
    "result['sid']            = data[test_index]['sid']\n",
    "result['recommend_mode'] = lgb_model.predict(test_x)\n",
    "result['recommend_mode'] = result['recommend_mode'].astype(int)\n",
    "print(len(result))\n",
    "print(result['recommend_mode'].value_counts())\n",
    "result[['sid', 'recommend_mode']].to_csv(path + '/sub/baseline.csv', index=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 其他开源代码\n",
    "\n",
    "下面是两份非常不错的开源代码,有兴趣的同学可以去看看.\n",
    "\n",
    "1. https://github.com/yaoxuefeng6/Paddle_baseline_KDD2019\n",
    "2. https://github.com/jiuxianghedonglu/Context-Aware-Multi-Modal-Transportation-Recommendation/tree/master/code"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 小结\n",
    "\n",
    "本篇文章,我们基于上一篇的EDA,给出了我们目前方案中的最重要方案之一 --- **多分类模型部分**,本文给出了我们多分类方案的整体框架,部分重要的基础特征工程,以及较为一致的线下验证方案与结果分析方案,通过增加特征等方案,该方案可以取得线上0.697左右的成绩。\n",
    "\n",
    "最后感谢两位大佬的开源以及有夕的代码框架！祝大家比赛愉快,不管结果如何,我们的方案会在比赛结束后和大家一起分享。\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "349px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
